Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
The user wrote in their first post: "I have a lot of observations in my dataset". Here's one way to do it with a data.table: a=data.table(a) ans = a[ , list(dt=dt[dt-min(dt)7]) , by="var1,var2,var3"] class(ans$dt) = "Date" Timings are below comparing the 3 methods. In this

Re: [R] problem of data manipulation

2010-01-20 Thread hadley wickham
Note that in the documentation ?"[.data.table" where I say that 'by' is slow, I mean relative to how fast it could be. It seems, in this specific example anyway, and with the code posted so far, to be significantly faster than sqldf and plyr. Of course the best of both worlds would be to use

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
Sounds like a good idea. Would it be possible to give an example of how to combine plyr with data.table, and why that is better than a data.table only solution ? hadley wickham h.wick...@gmail.com wrote in message news:f8e6ff051001200624r2175e38xf558dc8fa3fb6...@mail.gmail.com... Note that in

Re: [R] problem of data manipulation

2010-01-20 Thread hadley wickham
On Wed, Jan 20, 2010 at 8:43 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: Sounds like a good idea. Would it be possible to give an example of how to combine plyr with data.table, and why that is better than a data.table only solution? Well, ideally, you'd do: adt <- data.table(a) ans2 <-

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
I see now, thanks for explaining that. Would it be up to you to add data.table methods to ddply, then, for this to happen? Or does a ddply method need to be added to data.table? hadley wickham h.wick...@gmail.com wrote in message

Re: [R] problem of data manipulation

2010-01-19 Thread hadley wickham
On Mon, Jan 18, 2010 at 1:54 PM, Bert Gunter gunter.ber...@gene.com wrote: One way to do it: 1. Convert your date column to the Date class using the as.Date() function. This allows you to do the necessary arithmetic on the dates below. dt <- as.Date(a[,4], "%d/%m/%Y") 2. Create a factor out of
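A minimal base R sketch of the two steps quoted above. The snippet is cut off after step 2, so everything past the as.Date() call is an assumption rather than Bert's actual code: the grouping via interaction(), the use of ave(), and the `< 7` cutoff (the comparison operator did not survive in the archived snippets).

```r
# Example data from the original post, with the stripped quotes restored.
a <- data.frame(
  var1 = c("s", "c", "c", "n", "n", "n"),
  var2 = c(rep(1, 3), rep(2, 3)),
  var3 = c(rep(2, 3), rep(1, 3)),
  var4 = c("01/01/1999", "10/02/2000", "13/02/2000",
           "11/02/2000", "15/02/2000", "23/02/2000"),
  stringsAsFactors = FALSE
)

# Step 1: convert the day/month/year strings to the Date class.
dt <- as.Date(a$var4, format = "%d/%m/%Y")

# Step 2 (assumed): one factor identifying each var1/var2/var3 combination.
grp <- interaction(a$var1, a$var2, a$var3, drop = TRUE)

# Days elapsed since the earliest date within each group.
day <- as.numeric(dt)
days_since_first <- day - ave(day, grp, FUN = min)

# Assumed filter: keep rows less than 7 days after the group's first date.
keep <- a[days_since_first < 7, ]
```

Because ave() takes the group minimum directly, this version does not rely on the dates being sorted within each group.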

Re: [R] problem of data manipulation

2010-01-19 Thread Gabor Grothendieck
Using the data frame, a, from the post below, this is how it would be done in SQL using sqldf. We join the original table, a, with a table of minimums (computed by the nested select) and then choose only the rows where dt - mindt 7 (in the where clause). library(sqldf) sqldf("select var1,
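The SQL statement is truncated in the archive, so rather than guess at it, here is a rough base R analogue of the same join-then-filter idea, with aggregate() standing in for the nested select and merge() for the join. The column names day and minday and the `< 7` comparison are assumptions made for illustration; this is not Gabor's sqldf code.

```r
# Example data from the original post, with the stripped quotes restored.
a <- data.frame(
  var1 = c("s", "c", "c", "n", "n", "n"),
  var2 = c(rep(1, 3), rep(2, 3)),
  var3 = c(rep(2, 3), rep(1, 3)),
  var4 = c("01/01/1999", "10/02/2000", "13/02/2000",
           "11/02/2000", "15/02/2000", "23/02/2000"),
  stringsAsFactors = FALSE
)

# Dates as day counts, so min() and subtraction stay plain numeric.
a$day <- as.numeric(as.Date(a$var4, format = "%d/%m/%Y"))

# "Nested select": one row per group holding that group's earliest day.
mins <- aggregate(day ~ var1 + var2 + var3, data = a, FUN = min)
names(mins)[names(mins) == "day"] <- "minday"

# Join the original rows to their group minimum.
joined <- merge(a, mins, by = c("var1", "var2", "var3"))

# "Where clause" (assumed operator): less than 7 days after the minimum.
ans <- joined[joined$day - joined$minday < 7,
              c("var1", "var2", "var3", "var4")]
```

Note that merge() sorts the result by the key columns, so the row order of ans differs from that of a.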

[R] problem of data manipulation

2010-01-18 Thread rusers.sh
Hello, See my problem below. a<-data.frame(c("s","c","c","n","n","n"),c(rep(1,3),rep(2,3)),c(rep(2,3),rep(1,3)),c("01/01/1999","10/02/2000","13/02/2000","11/02/2000","15/02/2000","23/02/2000")) colnames(a)<-c("var1","var2","var3","var4") a var1 var2 var3 var4 1 s 1 2 01/01/1999 2 c 1 2 10/02/2000

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
Sent: Monday, January 18, 2010 10:40 AM To: r-help@r-project.org Subject: [R] problem of data manipulation Hello, See my problem below. a<-data.frame(c("s","c","c","n","n","n"),c(rep(1,3),rep(2,3)),c(rep(2,3),rep(1,3)),c("01/01/1999","10/02/2000","13/02/2000","11/02/2000","15/02/2000","23/02/2000")) colnames(a)<-c("var1

Re: [R] problem of data manipulation

2010-01-18 Thread William Dunlap
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter Sent: Monday, January 18, 2010 11:54 AM To: 'rusers.sh'; r-help@r-project.org Subject: Re: [R] problem of data manipulation One way to do it: 1. Convert your

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
:15 PM To: Bert Gunter; rusers.sh; r-help@r-project.org Subject: Re: [R] problem of data manipulation -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter Sent: Monday, January 18, 2010 11:54 AM To: 'rusers.sh'; r-help@r

Re: [R] problem of data manipulation

2010-01-18 Thread William Dunlap
-Original Message- From: Bert Gunter [mailto:gunter.ber...@gene.com] Sent: Monday, January 18, 2010 12:32 PM To: William Dunlap; 'rusers.sh'; r-help@r-project.org Subject: RE: [R] problem of data manipulation Absolutely... so long as you assume the dates are in order

Re: [R] problem of data manipulation

2010-01-18 Thread rusers.sh
Sent: Monday, January 18, 2010 12:15 PM To: Bert Gunter; rusers.sh; r-help@r-project.org Subject: Re: [R] problem of data manipulation -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter Sent: Monday

Re: [R] problem of data manipulation

2010-01-18 Thread rusers.sh
...@tibco.com -Original Message- From: Bert Gunter [mailto:gunter.ber...@gene.com] Sent: Monday, January 18, 2010 12:32 PM To: William Dunlap; 'rusers.sh'; r-help@r-project.org Subject: RE: [R] problem of data manipulation Absolutely... so long as you assume the dates are in order

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
Gunter; r-help@r-project.org Subject: Re: [R] problem of data manipulation I just remembered that var2 and var3 in my actual dataset are numerical data, e.g. 12.34, not factors. The above example data is misleading. Suppose var2 and var3 are numerical variables, not factors. How should we do
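The replies to this question are not part of the snippet, so the following is only one possible sketch. Base R grouping functions such as interaction() coerce numeric vectors to factors, so exactly equal numeric values land in the same group and the same approach carries over. The numeric values below are hypothetical, and the `< 7` cutoff is again an assumption.

```r
# Same layout as the original example, but var2 and var3 hold
# hypothetical non-integer numeric values instead of 1 and 2.
a <- data.frame(
  var1 = c("s", "c", "c", "n", "n", "n"),
  var2 = c(rep(12.34, 3), rep(56.78, 3)),
  var3 = c(rep(0.5, 3), rep(1.5, 3)),
  var4 = c("01/01/1999", "10/02/2000", "13/02/2000",
           "11/02/2000", "15/02/2000", "23/02/2000"),
  stringsAsFactors = FALSE
)

day <- as.numeric(as.Date(a$var4, format = "%d/%m/%Y"))

# interaction() converts each numeric column to a factor, so rows
# with identical var1/var2/var3 values share one group level.
grp <- interaction(a$var1, a$var2, a$var3, drop = TRUE)

# Earliest day per group, then the assumed "within 7 days" filter.
first <- ave(day, grp, FUN = min)
ans <- a[day - first < 7, ]
```

One caveat: values that differ only by floating-point rounding would fall into separate groups, so computed (rather than recorded) numeric keys may need rounding before grouping.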