table is reasonably fast. I have more than 4 x 10^6 records, and a 2D table takes very little time:

nUA <- with(TRdta, table(URwbc, URrbc))  # both URwbc and URrbc are factors
nUA

This does the same thing and took about 5 seconds just now:

xtabs(~ URwbc + URrbc, data = TRdta)
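
For the single-column count described below, a plain one-argument table() call should likewise be fast in base R; here is a minimal sketch on made-up data (my.df and my.field are stand-ins for the poster's actual objects, and the group labels are invented):

```r
## Made-up example: count rows per group without aggregate() or split()
set.seed(1)
my.df <- data.frame(my.field = sample(letters[1:5], 1e6, replace = TRUE))

counts <- table(my.df$my.field)  # single pass; returns a named count vector
counts
```

On a factor or character column this is a single tabulation pass, which is why it tends to beat aggregate(..., length) or split()-then-count for this particular task.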

On Sep 2, 2009, at 6:39 PM, Leo Alekseyev wrote:

I have a data frame with about 10^6 rows; I want to group the data
according to entries in one of the columns and do something with it.
For instance, suppose I want to count up the number of elements in
each group.  I tried something like aggregate(my.df$my.field,
list(my.df$my.field), length) but it seems to be very slow.  Likewise,
the split() function was slow (I killed it before it completed).  Is
there a way to efficiently accomplish this in R?..  I am almost
tempted to write an external Perl/Python script entering every row
into a hashtable keyed by my.field and iterating over the keys...
Might this be faster?..



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
