The original question was about a matrix, not a vector, and unique() is much slower on a matrix:

  x <- sample(100000, size=13584763, replace=TRUE)
  dim(x) <- c(13584763, 1)
  system.time(unique(x))

So the solution would be:

  unique(as.vector(x))

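For anyone who wants to check this on their own machine, here is a small sketch of the comparison. The size n and the names t_matrix, t_vector, u_mat and u_vec are only illustrative choices (n is much smaller than in the original question so that it finishes quickly), and the size of the timing gap may depend on your R version, so no timings are quoted here.

```r
# Illustrative sizes only; the original question had 13584763 rows.
set.seed(1)
n <- 200000
x <- sample(1000, size = n, replace = TRUE)
dim(x) <- c(n, 1)                      # turn the vector into an n-by-1 matrix

# unique() on a matrix dispatches to unique.matrix(), which compares rows
t_matrix <- system.time(u_mat <- unique(x))

# dropping the dim attribute first uses the default vector method
t_vector <- system.time(u_vec <- unique(as.vector(x)))

print(rbind(matrix = t_matrix, vector = t_vector))

# Same values in the same order; only the shape of the result differs
stopifnot(identical(as.vector(u_mat), u_vec))
```

Note that unique(x) on the matrix returns a one-column matrix, while unique(as.vector(x)) returns a plain vector, so downstream code may need adjusting accordingly.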
>>> From: Duncan Murdoch <murdoch.dun...@gmail.com>
To: G FANG <fanggan...@gmail.com>
CC: <r-help@r-project.org>
Date: 22/Jun/2010 1:20p
Subject: Re: [R] how to efficiently compute set unique?

On 21/06/2010 9:06 PM, G FANG wrote:
> Hi,
>
> I want to get the unique set from a large numeric k by 1 vector, k is
> in tens of millions
>
> when I used the matlab function unique, it takes less than 10 secs
>
> but when I tried to use the unique in R with similar CPU and memory,
> it is not done in minutes
>
> I am wondering, am I using the function in the right way?
>
> dim(cntxtn)
> [1] 13584763 1
> uniqueCntxt = unique(cntxtn); # this is taking really long

What type is cntxtn? If I do that sort of thing on a numeric vector,
it's quite fast:

 > x <- sample(100000, size=13584763, replace=T)
 > system.time(unique(x))
    user  system elapsed
    3.61    0.14    3.75

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.