> From: [EMAIL PROTECTED]
>
> On 25-Nov-04 Ted Harding wrote:
> > 'unique' will eat x for breakfast, indeed, but will have some
> > trouble chewing (x,y).
> >
> > I still can't think of a neat way of doing that.
> >
> > Best wishes,
> > Ted.
>
> Sorry, I don't want to be misunderstood.
> I didn't mean that 'unique' won't work for arrays.
> What I meant was:
>
> > X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
> > system.time(unique(X))
> [1] 0.74 0.07 0.81 0.00 0.00
> > system.time(unique(cbind(X,Y)))
> [1] 350.81 4.56 356.54 0.00 0.00

Do you know if the majority of that time is spent in unique() itself? If so,
which method? What I see is:

> X<-round(rnorm(1e6),3);Y<-round(rnorm(1e6),3)
> system.time(unique(X), gcFirst=TRUE)
[1] 0.25 0.01 0.26 NA NA
> system.time(unique(cbind(X,Y)), gcFirst=TRUE)
[1] 101.80 0.34 104.61 NA NA
> system.time(dat <- data.frame(x=X, y=Y), gcFirst=TRUE)
[1] 10.17 0.00 10.24 NA NA
> system.time(unique(dat), gcFirst=TRUE)
[1] 23.94 0.11 24.15 NA NA

Andy

> However, still rounding to 3 d.p. we can try packing:
>
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 0.83 0.05 0.88 0.00 0.00
> > length(W)
> [1] 961523
>
> Though the runtime is small we don't get much reduction
> and still W has to be unpacked.
>
> With rounding to 2 d.p.
>
> > X<-round(rnorm(1e6),2);Y<-round(rnorm(1e6),2)
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 1.31 0.01 1.32 0.00 0.00
> > length(W)
> [1] 209882
>
> so now it's about 1/5, but visible discretisation must be
> getting close.
>
> With 1 d.p.
>
> > X<-round(rnorm(1e6),1);Y<-round(rnorm(1e6),1)
> > Z<-100000000*X + 1000*Y
> > system.time(W<-unique(Z))
> [1] 0.92 0.01 0.93 0.00 0.00
> > length(W)
> [1] 4953
>
> there's a good reduction (about 1/200) but the discretisation
> would definitely now be visible. However, as I suggested before,
> there's an issue of choice of constant (i.e. of the resolution
> of the discretisation so that there's a useful reduction and
> also the plot is acceptable).
>
> I'd still like to learn of a method which avoids the
> above method of packing, which strikes me as clumsy
> (but maybe it's the best way after all).
>
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
> Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
> Date: 25-Nov-04 Time: 01:45:48
> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
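
For reference, the packing idea quoted above can be sketched with the unpacking step written out. This is only an illustration under stated assumptions, not the code used in the thread: the function name pack_unique and its digits argument are hypothetical, and instead of the constants 100000000 and 1000 it packs integer codes (the coordinates multiplied by 10^digits and rounded), which keeps the key exact in double precision and makes the unpacking plain modular arithmetic.

```r
# Sketch only: pack_unique and 'digits' are hypothetical names, not from the thread.
# Pack each rounded (x, y) pair into one numeric key, call unique() on the
# key vector, then unpack the surviving keys back into coordinates.
pack_unique <- function(x, y, digits = 3) {
  scale <- 10^digits
  xi <- round(x * scale)            # integer-valued codes for x
  yi <- round(y * scale)            # integer-valued codes for y
  yi_min <- min(yi)
  span <- max(yi) - yi_min + 1      # number of possible y codes
  key <- xi * span + (yi - yi_min)  # distinct pairs give distinct keys
  k <- unique(key)                  # the fast vector method does the work
  yk <- k %% span                   # %% with a positive divisor lies in [0, span)
  xk <- (k - yk) / span
  data.frame(x = xk / scale, y = (yk + yi_min) / scale)
}

# Small deterministic example: five points, three distinct after rounding.
x <- c(0.1234, 0.1231, -1.5, -1.5, 2)
y <- c(0.5, 0.5004, -0.25, -0.25, 2)
res <- pack_unique(x, y, digits = 3)
print(res)
```

The choice of digits plays the role of the "choice of constant" mentioned in the quoted message: fewer digits give a larger reduction but a coarser grid. The returned coordinates are the integer codes divided by 10^digits, so they should agree with round(x, digits) up to floating-point representation.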