Re: [R] Using by() and stacking back sub-data frames to one data frame

David Winsemius Thu, 25 Jun 2009 06:09:38 -0700

Your request for a more general approach is precisely the reason that
Hadley Wickham wrote the plyr package. He describes a split-apply-combine
strategy for a variety of data structures, and tools to implement those
strategies, here:


http://had.co.nz/plyr/plyr-intro-090510.pdf

The argument to the "by" step is a column name rather than a list or
object as it would be in tapply or split. I is just the identity
function, which stands in for function(x) return(x) in your code.


> library(plyr)
> ddply(y, "month", .fun = I)
      suid month esr
1  1074034     1   2
2  1074034     1   1
3  1074034     1   2
4  1074034     1   9
5  1123003     1   2
6  1074034     2   2
7  1074034     2   1
8  1074034     2   2
9  1074034     2   9
10 1123003     2   2
11 1074034     3   2
12 1074034     3   1
13 1074034     3   2
14 1074034     3   9
15 1123003     3   2
16 1074034    12   6
17 1074034    12   1
18 1074034    12   2
19 1074034    12   9
20 1123003    12   2
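
For comparison, here is a minimal base-R sketch of the same
split-apply-combine step, without plyr. It assumes the example data frame
y from the original message quoted below; the names pieces and stacked
are only illustrative. do.call() hands every sub-data frame returned by
by() to a single rbind() call, which is probably what was missing when
rbind() was tried directly on the by object.

```r
# Example data, copied from the original message quoted below.
y <- data.frame(suid  = c(rep(1074034, 16), rep(1123003, 4)),
                month = rep(c(12, 1, 2, 3), 5),
                esr   = c(6, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2,
                          9, 9, 9, 9, 2, 2, 2, 2))

# Split/apply: by() returns a list-like object holding one data frame
# per level of month (here the function just returns each piece as is).
pieces <- by(y, y$month, function(x) x)

# Combine: pass all pieces to one rbind() call.
stacked <- do.call(rbind, pieces)

# rbind() may build row names from the group labels; reset them to
# 1..n if they are not needed.
rownames(stacked) <- NULL
```

The rows come back ordered by month rather than in the original order of
y. split(y, y$month) could be used in place of by() when no function
needs to be applied to the pieces.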

On Jun 24, 2009, at 11:34 PM, Stephan Lindner wrote:

Dear all,


I have code that subsets a data frame to match entries within levels
of a factor (actually, the full script uses three different factors
to do that). I'm very happy with the precision with which I can work
with R, but since I loop over factor levels, and the data frame is
big, the process is slow. So I've been trying to speed up the process
using by(), but I got stuck at the point where I want to stack the
sub-data frames back together, and I was wondering whether someone
could help me out.

Here is an example:

<--

y <- data.frame(suid  = c(rep(1074034,16),rep(1123003,4)),
                month = rep(c(12,1,2,3),5),
                esr   = c(6,2,2,2,1,1,1,1,2,2,2,2,9,9,9,9,2,2,2,2))

by(y, y$month, function(x) return(x))


y$month: 1
     suid month esr
2  1074034     1   2
6  1074034     1   1
10 1074034     1   2
14 1074034     1   9
18 1123003     1   2
------------------------------------------------------------
y$month: 2
     suid month esr
3  1074034     2   2
7  1074034     2   1
11 1074034     2   2
15 1074034     2   9
19 1123003     2   2
------------------------------------------------------------
y$month: 3
     suid month esr
4  1074034     3   2
8  1074034     3   1
12 1074034     3   2
16 1074034     3   9
20 1123003     3   2
------------------------------------------------------------
y$month: 12
     suid month esr
1  1074034    12   6
5  1074034    12   1
9  1074034    12   2
13 1074034    12   9
17 1123003    12   2

-->

What I would like to do is stack these four data frames back into one
data frame, which in this simple example would just be y. I tried
unlist(), unclass() and rbind(), but none of them worked.


Thanks a lot,

Stephan

--
-----------------------
Stephan Lindner
University of Michigan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
