On Wed, 3 Aug 2005, Prof Brian Ripley wrote:

> From the help page:
>
>      'clara' is fully described in chapter 3 of Kaufman and Rousseeuw
>      (1990).  Compared to other partitioning methods such as 'pam', it
>      can deal with much larger datasets.  Internally, this is achieved
>      by considering sub-datasets of fixed size ('sampsize') such that
>      the time and storage requirements become linear in n rather than
>      quadratic.
>
> and the default for 'sampsize' is apparently at least nrow(x).

Correction, sorry, in your case 40 + 2*k = 54.

> So you need to set 'sampsize' (and perhaps 'samples') appropriately,

That might be it, but a traceback() showing where the error is occurring
would help.  Another possible place is in the initial manipulations
scaling the data matrix.  Since sub-sampling is used, you can start with a
much smaller subset of the data.

> On Wed, 3 Aug 2005, Nestor Fernandez wrote:
>
>> Dear all,
>>
>> I'm trying to estimate clusters from a very large dataset using clara but
>> the program stops with a memory error. The (very simple) code and the error:
>>
>> mydata<-read.dbf(file="fnorsel_4px.dbf")
>> my.clara.7k<-clara(mydata,k=7)
>>
>>> Error: cannot allocate vector of size 465108 Kb
>>
>> The dataset contains >3,000,000 rows and 15 columns. I'm using a windows
>> computer with 1.5G RAM; I also tried changing the memory limit to the
>> maximum possible (4000M)
>
> Actually, the limit is probably 2048M: see the rw-FAQ Q on memory limits.
>
>> Is there a way to calculate clara clusters from such large datasets?

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
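
For readers following the thread, here is a minimal sketch of the two suggestions above: try a much smaller subset first, and set 'sampsize' and 'samples' explicitly. The matrix 'toy' is a hypothetical stand-in for the real data (the original .dbf file is not available here), and the values chosen for 'samples' and 'sampsize' are illustrative assumptions, not recommendations from the thread.

```r
library(cluster)  # provides clara()

set.seed(1)

# Hypothetical stand-in for the real data: 15 numeric columns, far fewer rows.
n <- 20000
toy <- matrix(rnorm(n * 15), ncol = 15)

k <- 7

# Documented default sub-sample size is min(n, 40 + 2 * k), i.e. 54 for k = 7.
default_sampsize <- min(nrow(toy), 40 + 2 * k)

# Start with a small random subset to check that the call runs at all.
sub <- toy[sample(nrow(toy), 5000), ]

# Set 'samples' and 'sampsize' explicitly (illustrative values only).
# keep.data = FALSE avoids storing a copy of the data in the result.
fit <- clara(sub, k = k, samples = 20, sampsize = 500, keep.data = FALSE)

# Cluster sizes on the subset.
table(fit$clustering)
```

If a call like this still fails on the full data, running traceback() immediately after the error should show which step is requesting the large allocation.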