On Tue, Aug 28, 2012 at 9:48 AM, dexter morgan <dextermorga...@gmail.com> wrote:
>
> I understand your solution ( i think) , didn't think of that, in that
> particular way.
> I think that lets say i have 1M data-points, and running knn , that the
> k=1M and n=10 (each point is a cluster that requires up to 10 points)
> is an overkill.
>

I am not sure I understand you. n = number of points. k = number of clusters. For searching 1 million points, I would recommend thousands of clusters.

> How can i achieve the same result WITHOUT using mahout, just running on
> the dataset , i even think it'll be in the same complexity (o(n^2))
>

Running with a good knn package will give you roughly O(n log n) complexity.
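
To make the complexity point concrete, here is a minimal sketch in plain Python of how a nearest-neighbor search can avoid comparing every pair of points. It uses a k-d tree, which is one common index that knn packages use; it is not the clustering approach discussed earlier in this thread, and I am assuming low-dimensional points. The names build_kdtree, knn_search, nearest and n_neighbors are made up for this illustration (n_neighbors is the number of neighbors wanted per point, not the n or k above).

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, left, right).

    Sorting at every level keeps the code short; it costs O(n log^2 n)
    overall, which is still far below O(n^2) for large n.
    """
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def knn_search(tree, query, n_neighbors, depth=0, heap=None):
    """Collect the n_neighbors closest points in a max-heap of (-dist, point)."""
    if heap is None:
        heap = []
    if tree is None:
        return heap
    point, left, right = tree
    d = sq_dist(point, query)
    if len(heap) < n_neighbors:
        heapq.heappush(heap, (-d, point))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, point))
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn_search(near, query, n_neighbors, depth + 1, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current worst neighbor; this pruning is what avoids the full scan.
    if len(heap) < n_neighbors or diff * diff < -heap[0][0]:
        knn_search(far, query, n_neighbors, depth + 1, heap)
    return heap


def nearest(tree, query, n_neighbors):
    """Return the n_neighbors closest points, nearest first."""
    heap = knn_search(tree, query, n_neighbors)
    return [p for _, p in sorted(heap, reverse=True)]


# Illustrative data only: 2000 random 2-D points.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(2000)]
tree = build_kdtree(points)

query = points[0]
# Ask for 11 and drop the first hit, which is the query point itself.
neighbors = nearest(tree, query, 11)[1:]
```

In low dimensions each query touches on the order of log n tree nodes on average, so finding the neighbors of all n points costs roughly O(n log n) rather than O(n^2). In high dimensions the pruning becomes much less effective, which is one reason to consider clustering or other approximate methods instead.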