There are known problems with the k-means implementation in Mahout 0.2. Try the trunk version instead. The 0.3 release is very close and we are entering code freeze for it, so you should be fine with the latest trunk.
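
Separately from the version issue, some run-to-run variation is expected whenever the initial centroids are picked at random. One way to check how much of the difference comes from initialization is to fix the random seed. The following is a minimal sketch in plain Java, not Mahout's API: the class name SeededKMeans, the method cluster, and its parameters are hypothetical and only illustrate that a fixed seed gives the same initial centroids and therefore the same clustering on the same data. It assumes k is no larger than the number of points.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this is not Mahout code.
public class SeededKMeans {

    /** Runs Lloyd's k-means and returns the cluster index assigned to each point. */
    public static int[] cluster(double[][] points, int k, long seed, int maxIter) {
        Random rng = new Random(seed); // fixed seed: same initial centroids on every run
        int n = points.length;
        int d = points[0].length;

        // Choose k distinct points as initial centroids (partial Fisher-Yates shuffle).
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int j = c + rng.nextInt(n - c);
            int tmp = idx[c]; idx[c] = idx[j]; idx[j] = tmp;
            centroids[c] = points[idx[c]].clone();
        }

        int[] assign = new int[n];
        Arrays.fill(assign, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dist = 0.0;
                    for (int t = 0; t < d; t++) {
                        double diff = points[i][t] - centroids[c][t];
                        dist += diff * diff;
                    }
                    if (dist < bestDist) { bestDist = dist; best = c; }
                }
                if (assign[i] != best) { assign[i] = best; changed = true; }
            }
            if (!changed) break; // converged

            // Update step: move each centroid to the mean of its members.
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[assign[i]]++;
                for (int t = 0; t < d; t++) sum[assign[i]][t] += points[i][t];
            }
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) continue; // keep the old centroid for an empty cluster
                for (int t = 0; t < d; t++) centroids[c][t] = sum[c][t] / count[c];
            }
        }
        return assign;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}};
        int[] a = cluster(data, 2, 42L, 20);
        int[] b = cluster(data, 2, 42L, 20);
        // Same data and same seed, so the two assignments are identical.
        System.out.println(Arrays.equals(a, b));
    }
}
```

If results still differ from run to run with a fixed seed, the cause lies somewhere other than the random initialization.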

On Wed, Feb 24, 2010 at 5:46 PM, Arshad Khan <[email protected]> wrote:

> Hello
>
> I am using Mahout 0.2 implementation of KMeans in one of my Text Mining
> project. I apply KMeans with a default K value of 4. It seems that every
> time I repeat the clustering process on the same data set, the results are
> different and difference (in terms of cluster size and membership) is great
> from run to run. The initial set of centroid points are chosen randomly
> through RandomSeedGenerator. Is there a way to obtain more consistent
> results that do not differ so greatly? Or may be I am doing something
> wrong?
>
> Any help or idea is very much appreciated.
>
> Thanks and Regards
> Arshad
>

--
Ted Dunning, CTO
DeepDyve
