Hi Andy,

On 25-Nov-04 Liaw, Andy wrote:
>> From: [EMAIL PROTECTED]
>> [...]
>> > X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
>> > system.time(unique(X))
>> [1] 0.74 0.07 0.81 0.00 0.00
>> > system.time(unique(cbind(X,Y)))
>> [1] 350.81   4.56 356.54   0.00   0.00
> 
> Do you know if the majority of that time is spent in unique() itself?
> If so, which method?  What I see is:
> 
>> X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
>> system.time(unique(X), gcFirst=TRUE)
> [1] 0.25 0.01 0.26   NA   NA
>> system.time(unique(cbind(X,Y)), gcFirst=TRUE)
> [1] 101.80   0.34 104.61     NA     NA
>> system.time(dat <- data.frame(x=X, y=Y), gcFirst=TRUE)
> [1] 10.17  0.00 10.24    NA    NA
>> system.time(unique(dat), gcFirst=TRUE)
> [1] 23.94  0.11 24.15    NA    NA
> 
> Andy

I want to look into this a bit more systematically (I have
an idea why 'unique' may be taking longer on the array from
'cbind' than on the dataframe), but I will be doing this on
a much faster machine than the one I have to hand right now,
so I will report the results (if interesting) later.
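
For anyone who wants to try the comparison in the meantime, here is
a rough sketch of the kind of test I have in mind. The sample size n
is my own choice (much smaller than the 1e6 used above, so that it
finishes quickly on a slow machine), and the object names are purely
illustrative:

```r
# Sketch: time unique() on a vector, on the matrix from cbind(),
# and on the equivalent data frame. n is an assumed, reduced size.
set.seed(1)
n <- 1e4
X <- round(rnorm(n), 3)
Y <- round(rnorm(n), 3)

m   <- cbind(X, Y)               # two-column numeric matrix
dat <- data.frame(x = X, y = Y)  # same data as a data frame

# Each row of 'timings' is one system.time() result.
timings <- rbind(
  vector     = system.time(u_vec <- unique(X)),
  matrix     = system.time(u_mat <- unique(m)),
  data.frame = system.time(u_df  <- unique(dat))
)
print(timings[, 1:3])

# Number of distinct values / rows found in each case.
print(c(vector = length(u_vec), matrix = nrow(u_mat),
        data.frame = nrow(u_df)))
```

Increasing n towards 1e6 should show how the three timings scale,
though the absolute figures will of course depend on the machine
and the version of R.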

Meanwhile, I'm not sure what you mean by "which method?",
and I'm also wondering what "gcFirst" is about.
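
My guess, and it is only a guess, is that "method" refers to the
S3 methods that the generic 'unique' dispatches to depending on the
class of its argument, and that 'gcFirst=TRUE' asks 'system.time'
to run a garbage collection immediately before timing. The sketch
below is how I would check this; the small objects m and dat are
made up for illustration:

```r
# List the methods registered for the generic unique().
print(utils::methods(unique))

# Tiny illustrative objects (not the data from this thread).
m   <- cbind(X = c(1, 1, 2), Y = c(3, 3, 4))
dat <- data.frame(x = c(1, 1, 2), y = c(3, 3, 4))

# The class of the argument decides which method is used.
print(class(m))
print(class(dat))

# Calling the methods directly should match what unique() returns.
stopifnot(identical(unique(m),   unique.matrix(m)))
stopifnot(identical(unique(dat), unique.data.frame(dat)))

# gcFirst = TRUE: garbage-collect first, then time the expression.
t_gc <- system.time(unique(m), gcFirst = TRUE)
print(t_gc)
```

If that reading is right, the matrix from 'cbind' and the dataframe
go through different code, which would fit with the very different
timings above.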

Thanks,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 25-Nov-04                                       Time: 14:30:39
------------------------------ XFMail ------------------------------

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
