Hello,

I am using the Mahout 0.2 implementation of KMeans in one of my text mining projects. I run KMeans with a default K value of 4. Every time I repeat the clustering on the same data set, the results are different, and the differences in cluster size and membership from run to run are large. The initial centroids are chosen randomly by RandomSeedGenerator. Is there a way to obtain more consistent results that do not differ so greatly? Or maybe I am doing something wrong?
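
To make the question concrete, here is a minimal plain-Java sketch of what I mean by random initial centroids. It is not Mahout code and does not use RandomSeedGenerator; the class name SeededKMeansSketch, its methods, and the explicit seed parameter are all made up for illustration, and I have not confirmed whether Mahout 0.2 exposes a comparable seed setting. It only shows that if the random choice of initial centroids is driven by a fixed seed, repeated runs on the same data give the same assignments.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only: a tiny k-means whose random
// initialisation is controlled by an explicit seed.
final class SeededKMeansSketch {

    /** Runs k-means and returns the cluster index assigned to each point. */
    static int[] cluster(double[][] points, int k, long seed, int maxIterations) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        Random random = new Random(seed); // same seed -> same initial centroids
        double[][] centroids = pickInitialCentroids(points, k, random);
        int dims = points[0].length;
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int nearest = nearestCentroid(points[i], centroids);
                if (nearest != assignment[i]) {
                    assignment[i] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point changed cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid for an empty cluster
                }
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    /** Picks k distinct input points at random (partial Fisher-Yates shuffle). */
    static double[][] pickInitialCentroids(double[][] points, int k, Random random) {
        int[] indices = new int[points.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int j = c + random.nextInt(indices.length - c);
            int tmp = indices[c];
            indices[c] = indices[j];
            indices[j] = tmp;
            centroids[c] = points[indices[c]].clone();
        }
        return centroids;
    }

    /** Returns the index of the centroid with the smallest squared Euclidean distance. */
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double distance = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {
            {0.0, 0.1}, {0.2, 0.0}, {5.0, 5.1}, {5.2, 4.9}, {9.0, 0.1}, {9.1, 0.3}
        };
        // Two runs with the same (arbitrary) seed start from the same centroids.
        int[] first = cluster(points, 3, 42L, 100);
        int[] second = cluster(points, 3, 42L, 100);
        System.out.println(Arrays.equals(first, second));
    }
}
```

In this sketch a different seed may or may not give a different clustering, depending on the data, which matches the run-to-run variation I am seeing with random initial centroids.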

Any help or ideas would be very much appreciated.

Thanks and regards,
Arshad
