On Wed, Apr 04, 2012 at 04:41:51PM -0700, Abhishek Pratap wrote:
> Thanks Chris. So I guess the question becomes how can I efficiently
> cluster 1 million x,y coordinates.

Did you try scikit-learn's implementation of DBSCAN:
http://scikit-learn.org/stable/modules/clustering.html#dbscan ?
I am not sure that it scales to that many points, but it is worth trying.

Alternatively, the best way to cluster massive datasets is to use the
mini-batch implementation of KMeans:
http://scikit-learn.org/stable/modules/clustering.html#mini-batch-k-means

Hope this helps,

Gael
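
A minimal sketch of the mini-batch KMeans suggestion, assuming a recent
scikit-learn release. The points are synthetic stand-ins for the real x,y
coordinates, and the values of n_clusters and batch_size are illustrative
guesses, not recommendations from this thread:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: three well-separated 2-D blobs.
# Replace `points` with your own (n_samples, 2) array of x,y coordinates.
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack(
    [c + rng.normal(scale=0.5, size=(20_000, 2)) for c in true_centers]
)

# Mini-batch KMeans updates the centroids from small random batches, so
# each update step touches only `batch_size` points, not the full array.
# n_clusters must be chosen up front (assumed to be 3 here).
km = MiniBatchKMeans(n_clusters=3, batch_size=1024, n_init=3, random_state=0)
labels = km.fit_predict(points)   # one integer cluster label per point
centers = km.cluster_centers_     # array of shape (n_clusters, 2)
```

Unlike DBSCAN, KMeans needs the number of clusters in advance and assigns
every point to a cluster, so the two methods can give quite different
results on the same data.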