Hi everyone,

I have a data.table called "data" with many columns, which I want to aggregate by column1. I am using data.table because of how fast it is.

The problem with looping over the columns of a data.table is that data.table does not accept quoted column names (e.g. "col2" instead of col2). I found a workaround, which is to use get("col2"). It works fine, but the processing time multiplies by 20.

So if I use:

data[,sum(col2),by=(key)]

with the column names entered by hand, the operation is done in 1 sec. If, on the contrary, I use:

data[,sum(get("col2")),by=(key)]

with a loop supplying the column names, the same operation takes 20 sec. I cannot use the former code because I have 100000 files to process, but the latter will simply take months to complete. Is there any alternative to the function "get", or any other way in which data.table can recognize the names of the columns?
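
For reference, here is a minimal sketch of the two variants on a made-up table. The column names, the table size, and the loop variable "cn" are placeholders for illustration only, not my actual data, and the timings on this toy table will not match the ones quoted above.

```r
library(data.table)

# Toy stand-in for the real table; names and sizes are assumptions.
set.seed(1)
data <- data.table(
  column1 = sample(letters[1:5], 1e5, replace = TRUE),
  col2    = runif(1e5),
  col3    = runif(1e5)
)
setkey(data, column1)

# Variant 1: column name typed by hand (the fast case).
by_hand <- data[, sum(col2), by = column1]

# Variant 2: column names supplied as strings in a loop via get()
# (the slow case on the real data).
cols <- c("col2", "col3")
results <- list()
for (cn in cols) {
  results[[cn]] <- data[, sum(get(cn)), by = column1]
}
```

Both variants return the same sums per group; only the way the column is referenced differs.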

Thanks,
Camilo

Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
         Provider code: 313
         Phone 776 2282
         From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.