[R] more complex by with data.table???

2015-06-09 Thread Ramiro Barrantes
Hello,

I am trying to do something that I can do with the "by" function on a
data.frame, but I can't figure out how to achieve it with data.table.

Consider

library(data.table)
dt <- data.table(name=c(rep("a",5),rep("b",6)), var1=0:10, var2=20:30, var3=40:50)
myFunction <- function(x) { mean(x) }

I am aware that I can do something like:

dt[, .(meanVar1=myFunction(var1)) ,by=.(name)]

but how could I do the equivalent of:

df<-data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
myFunction <- function(x) { mean(x) }

columnNames <- c("var1","var2","var3")
result <- by(df, df$name, function(x) {
  output <- c()
  for (col in columnNames) {
    output[col] <- myFunction(x[, col])
  }
  output
})
do.call(rbind, result)

Thanks in advance,
Ramiro


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] more complex by with data.table???

2015-06-09 Thread jim holtman
try this:

> dt[
+ , {
+ result <- list()
+ for (i in names(.SD)){
+ result[[i]] <- myFunction(unlist(.SD[, i, with = FALSE]))
+ }
+ result
+   }
+ , by = name
+ ]
   name var1 var2 var3
1:    a  2.0 22.0 42.0
2:    b  7.5 27.5 47.5
>
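
The same loop can also be written a little more directly. What follows is only a minimal sketch, assuming the `dt` and `myFunction` definitions from the original question. It pulls each column out of `.SD` as a plain vector with `[[`, which avoids the `unlist(.SD[, i, with = FALSE])` step. The names `colName` and `loopResult` are made up for this illustration.

```r
library(data.table)

# Data and function copied from the original question.
dt <- data.table(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) { mean(x) }

# Within each group, .SD holds every column except the grouping column.
# .SD[[colName]] returns that column as a plain vector.
loopResult <- dt[, {
  result <- list()
  for (colName in names(.SD)) {
    result[[colName]] <- myFunction(.SD[[colName]])
  }
  result  # a named list becomes one output column per element
}, by = name]

print(loopResult)
```

Returning a named list from `j` is what turns each element into its own column in the grouped result.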



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




Re: [R] more complex by with data.table???

2015-06-09 Thread Ista Zahn
Hi Ramiro,

There is a demonstration of this on the data.table wiki at
https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html.
You can do

dt[, lapply(.SD, mean), by=name]

or

dt[, as.list(colMeans(.SD)), by=name]

BTW, there are pretty straightforward ways to do this in base R as well, e.g.,

data.frame(t(sapply(split(df[-1], df$name), colMeans)))
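
Another base R option is `aggregate()` with a formula. This is a minimal sketch, assuming the `df` and `myFunction` definitions from the original question. The `. ~ name` formula applies the function to every non-grouping column within each level of `name`. The name `aggResult` is made up for this illustration.

```r
# Data and function copied from the original question.
df <- data.frame(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) { mean(x) }

# One row per level of name, one column per summarised variable.
aggResult <- aggregate(. ~ name, data = df, FUN = myFunction)

print(aggResult)
```

Unlike the `sapply` version, this keeps `name` as an ordinary column rather than as row names.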

Best,
Ista




Re: [R] more complex by with data.table???

2015-06-21 Thread Arunkumar Srinivasan
Ramiro,

`dt[, lapply(.SD, mean), by=name]` is the idiomatic way.
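
To restrict the calculation to the columns listed in `columnNames` from the original question, and to call `myFunction` rather than `mean` directly, `.SDcols` can be added. This is a minimal sketch, assuming the definitions from the original post. The extra `id` column and the name `sdResult` are hypothetical, added only to show that `.SDcols` leaves unlisted columns out.

```r
library(data.table)

# Data from the original question, plus a hypothetical id column.
dt <- data.table(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50,
                 id = 1:11)
myFunction <- function(x) { mean(x) }
columnNames <- c("var1", "var2", "var3")

# .SDcols limits .SD to the named columns, so id is not summarised.
sdResult <- dt[, lapply(.SD, myFunction), by = name, .SDcols = columnNames]

print(sdResult)
```

Without `.SDcols`, `.SD` would contain every column except the grouping column, so `id` would be averaged as well.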

I suggest reading through the new HTML vignettes at
https://github.com/Rdatatable/data.table/wiki/Getting-started

Ista, thanks for linking to the new vignette.


