Hi there, 

If the final clusters change depending on the random starting clusters, 
then I suspect that your data do not cluster very well. 
A few reasons for this may be: 

1) There are genuinely no clusters in the data!
2) You have chosen a poor distance measure.
3) You have picked an inappropriate number of clusters.

The basic goodness-of-fit criterion for a clustering is that the variance 
within each cluster is small and the variance between clusters is large. 
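
As a rough illustration of that criterion, here is a minimal sketch using the sums of squares that kmeans itself returns. The data 'x' are simulated for the example and are not your 'mydata'; the choice of 2 centers and nstart = 10 (several random starts, keeping the best one) is likewise only an assumption for the sketch.

```r
# Sketch: compare within-cluster and between-cluster sums of squares.
# 'x' is made-up example data with two well-separated groups.
set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2))

# nstart = 10 tries ten random starts and keeps the best solution
fit <- kmeans(x, centers = 2, nstart = 10)

within  <- fit$tot.withinss      # total within-cluster sum of squares
between <- fit$betweenss         # between-cluster sum of squares
ratio   <- between / fit$totss   # share of total variation between clusters

print(c(within = within, between = between, ratio = ratio))
```

A ratio close to 1 suggests tight, well-separated clusters; a low ratio is one sign that the data may not cluster well for that number of centers.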
When I start looking for clusters, I often use multidimensional scaling 
first to look at the data in 2D. 

Look up help(cmdscale).
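
Something along these lines might be a starting point. It is only a sketch: 'x' is simulated data standing in for your own, and Euclidean distance via dist() is an assumption, so substitute whatever distance measure suits your data.

```r
# Sketch: classical multidimensional scaling to view the data in 2D.
# 'x' is made-up 3-dimensional example data with two groups.
set.seed(1)
x <- rbind(matrix(rnorm(150, mean = 0), ncol = 3),
           matrix(rnorm(150, mean = 4), ncol = 3))

d   <- dist(x)             # Euclidean distances between rows
mds <- cmdscale(d, k = 2)  # one row per observation, two coordinates

# If the data contain clusters, they often show up as separate clouds here
plot(mds, xlab = "Coordinate 1", ylab = "Coordinate 2",
     main = "Classical MDS")
```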

If after this you wish to proceed, then I suggest you look at the cluster 
package, loaded with library(cluster). 
Its silhouette function is a nice tool for assessing the appropriate number 
of clusters. 
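
One possible way to use it is sketched below: run kmeans for several candidate numbers of clusters and compare the average silhouette width. The data 'x', the candidate range 2:6 and nstart = 10 are all assumptions made for the example, not anything taken from your analysis.

```r
# Sketch: pick the number of clusters by average silhouette width.
library(cluster)

# 'x' is made-up example data with three separated groups
set.seed(1)
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2),
           cbind(rnorm(30, mean = 8, sd = 0.5), rnorm(30, mean = 0, sd = 0.5)))
d <- dist(x)

ks <- 2:6  # candidate numbers of clusters (an arbitrary range)
avg_width <- sapply(ks, function(k) {
  fit <- kmeans(x, centers = k, nstart = 10)
  sil <- silhouette(fit$cluster, d)  # columns: cluster, neighbor, sil_width
  mean(sil[, "sil_width"])
})
names(avg_width) <- ks

print(round(avg_width, 3))
best_k <- ks[which.max(avg_width)]  # k with the largest average width
print(best_k)
```

Larger average widths indicate better-separated clusters; if every candidate gives a low value, that points back to reason 1) above.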

Regards

Wayne


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of "Julia Kröpfl"
Sent: 25 September 2007 10:01
To: R-help@r-project.org
Subject: [R] finding a stable cluster for kmeans


Hello!

I applied kmeans to my data:

kcluster = kmeans(mydata, 4, iter.max=10)
table(code, kcluster$cluster)

If I run this code again, I get a different result from the first trial (I 
understand that this is expected, since kmeans starts by assigning the 
clusters randomly, and therefore the outcomes can differ).
But is there a way to stabilize the clustering (meaning finding the one 
clustering that appears most often in 10 trials)?

Thank you for any ideas,
Julia 
--

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
