Also, in my case I don't really have a good estimate of the value of K for K-means.
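
One rough way to narrow K down is to fit the mini-batch KMeans at several candidate values and compare the inertia (within-cluster sum of squares), looking for the point where it stops dropping quickly. The sketch below assumes scikit-learn's MiniBatchKMeans; the synthetic blobs, the helper name inertia_by_k, the batch size and the candidate range are all placeholders for illustration, not anything taken from the real data.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)

# Synthetic stand-in for the real (x, y) coordinates: three well-separated blobs.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack([c + rng.randn(2000, 2) for c in centers])


def inertia_by_k(data, candidate_ks, batch_size=1024, seed=0):
    """Fit MiniBatchKMeans once per candidate K and return {K: inertia}.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    results = {}
    for k in candidate_ks:
        model = MiniBatchKMeans(
            n_clusters=k, batch_size=batch_size, random_state=seed, n_init=3
        )
        model.fit(data)
        results[k] = model.inertia_
    return results


scores = inertia_by_k(points, [1, 2, 3, 4, 5, 6])
for k in sorted(scores):
    print(k, scores[k])
```

Inertia generally keeps shrinking as K grows, so the value to look for is where the curve flattens, not the minimum. On 1 million points it may be cheaper to run this scan on a random subsample first.
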
-A

On Thu, Apr 5, 2012 at 8:06 AM, Abhishek Pratap <apra...@lbl.gov> wrote:
> Hi Gael
>
> The MemoryError exception I am getting is from using scikit's DBSCAN
> implementation. I can check mini-batch implementation of Kmeans.
>
> Best,
> -Abhi
>
> On Wed, Apr 4, 2012 at 10:33 PM, Gael Varoquaux
> <gael.varoqu...@normalesup.org> wrote:
>> On Wed, Apr 04, 2012 at 04:41:51PM -0700, Abhishek Pratap wrote:
>>> Thanks Chris. So I guess the question becomes how can I efficiently
>>> cluster 1 million x,y coordinates.
>>
>> Did you try the scikit-learn's implementation of DBSCAN:
>> http://scikit-learn.org/stable/modules/clustering.html#dbscan
>> ? I am not sure that it scales, but it's worth trying.
>>
>> Alternatively, the best way to cluster massive datasets is to use the
>> mini-batch implementation of KMeans:
>> http://scikit-learn.org/stable/modules/clustering.html#mini-batch-k-means
>>
>> Hope this helps,
>>
>> Gael
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion