Your request for a more general approach is precisely the reason that
Hadley Wickham wrote the plyr package. He describes a split-apply-combine
strategy for a variety of data structures, and the tools that implement
it, here:
http://had.co.nz/plyr/plyr-intro-090510.pdf
The argument to the "by" step is a column name rather than a list or
object as it would be in tapply or split. I() simply returns its
argument (marked with class "AsIs"), so it serves as the identity
function here, standing in for function(x) return(x) in your code.
> library(plyr)
> ddply(y, "month", .fun = I)
      suid month esr
1  1074034     1   2
2  1074034     1   1
3  1074034     1   2
4  1074034     1   9
5  1123003     1   2
6  1074034     2   2
7  1074034     2   1
8  1074034     2   2
9  1074034     2   9
10 1123003     2   2
11 1074034     3   2
12 1074034     3   1
13 1074034     3   2
14 1074034     3   9
15 1123003     3   2
16 1074034    12   6
17 1074034    12   1
18 1074034    12   2
19 1074034    12   9
20 1123003    12   2
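
If you would rather stay in base R, here is a minimal sketch of the same
split-apply-combine idea. I am assuming that each piece comes back as a
data frame with the same columns; the function(x) x in the lapply() step
is only a placeholder for whatever you really do within each level.
rbind() did not work for you because it was handed one list rather than
the pieces as separate arguments, which is what do.call() takes care of.

```r
y <- data.frame(suid  = c(rep(1074034, 16), rep(1123003, 4)),
                month = rep(c(12, 1, 2, 3), 5),
                esr   = c(6,2,2,2,1,1,1,1,2,2,2,2,9,9,9,9,2,2,2,2))

# split: a plain list of data frames, one per level of month
pieces <- split(y, y$month)

# apply: placeholder for the per-level work (here it returns each piece unchanged)
pieces <- lapply(pieces, function(x) x)

# combine: do.call() passes every list element to rbind() as its own argument
stacked <- do.call(rbind, pieces)

# row names come back as "<month>.<original row>"; reset them if unwanted
rownames(stacked) <- NULL
```

The rows of stacked are ordered by month, as in the ddply result, rather
than in the original order of y.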
On Jun 24, 2009, at 11:34 PM, Stephan Lindner wrote:
Dear all,
I have code in which I subset a data frame to match entries within
levels of a factor (actually, the full script uses three different
factors to do that). I'm very happy with the precision with which I can
work with R, but since I loop over factor levels, and the data frame is
big, the process is slow. So I've been trying to speed up the process
using by(), but I got stuck at the point where I want to stack the
sub-data frames back together, and I was wondering whether someone could
help me out.
Here is an example:
<--
y <- data.frame(suid = c(rep(1074034,16),rep(1123003,4)),
                month = rep(c(12,1,2,3),5),
                esr = c(6,2,2,2,1,1,1,1,2,2,2,2,9,9,9,9,2,2,2,2))

by(y, y$month, function(x) return(x))
y$month: 1
      suid month esr
2  1074034     1   2
6  1074034     1   1
10 1074034     1   2
14 1074034     1   9
18 1123003     1   2
------------------------------------------------------------
y$month: 2
      suid month esr
3  1074034     2   2
7  1074034     2   1
11 1074034     2   2
15 1074034     2   9
19 1123003     2   2
------------------------------------------------------------
y$month: 3
      suid month esr
4  1074034     3   2
8  1074034     3   1
12 1074034     3   2
16 1074034     3   9
20 1123003     3   2
------------------------------------------------------------
y$month: 12
      suid month esr
1  1074034    12   6
5  1074034    12   1
9  1074034    12   2
13 1074034    12   9
17 1123003    12   2
-->
What I would like to do is stack these four data frames back into one
data frame, which in this simple example would just be y. I tried
unlist(), unclass() and rbind(), but none of them worked.
Thanks a lot,
Stephan
--
-----------------------
Stephan Lindner
University of Michigan
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT