You could call ClusterQualitySummarizer, which in turn calls ClusteringUtils
to print the metrics you specified.
For an example, see the Streaming KMeans section in 
examples/bin/cluster-reuters.sh.

It calls 'qualcluster' with the options -i <tf-idf vectors generated by 
seq2sparse> -c <output of KMeans> -o <output file for the generated metrics>
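
To give a feel for what these metrics measure, here is a small standalone
Python sketch. It is not Mahout code and does not use the ClusteringUtils API.
The function names, the Euclidean distance, and the toy data are my own
assumptions, and it uses textbook centroid-based definitions (mean distance
for Davies-Bouldin, max distance for Dunn), so the numbers may not match what
Mahout reports.

```python
import math
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distances(points, centroids):
    """Assign each point to its nearest centroid and collect, per cluster,
    the distances from the assigned points to that centroid."""
    dists = [[] for _ in centroids]
    for p in points:
        d, i = min((euclidean(p, c), i) for i, c in enumerate(centroids))
        dists[i].append(d)
    return dists


def davies_bouldin(centroids, dists):
    """Average, over clusters, of the worst (scatter_i + scatter_j) / d(c_i, c_j).
    Scatter here is the mean point-to-centroid distance (an assumption)."""
    scatter = [mean(d) if d else 0.0 for d in dists]
    n = len(centroids)
    total = 0.0
    for i in range(n):
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in range(n)
            if j != i
        )
    return total / n


def dunn(centroids, dists):
    """Smallest inter-centroid distance divided by the largest
    point-to-centroid distance (a centroid-based approximation)."""
    n = len(centroids)
    min_inter = min(
        euclidean(centroids[i], centroids[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    max_intra = max(max(d) for d in dists if d)
    return min_inter / max_intra


# Toy data: two tight, well-separated clusters on a line.
centroids = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (-1.0, 0.0), (9.0, 0.0), (11.0, 0.0)]

dists = cluster_distances(points, centroids)
print("Davies-Bouldin:", davies_bouldin(centroids, dists))
print("Dunn:", dunn(centroids, dists))
```

A lower Davies-Bouldin value and a higher Dunn value both indicate tighter,
better separated clusters.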


I have not tried this on KMeans, and since the output format of KMeans is 
different from that of Streaming KMeans, it may not work.
It may also fail to read some of the clusters if they contain only a single 
clustered point. This is because the new TDigest summarizer expects at least 
2 points in order to calculate the max, quartiles, and mean.
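
One way to work around the single-point problem is to skip such clusters
before summarizing. The Python sketch below only illustrates that idea. It is
not the TDigest summarizer or any Mahout class, and the function name, the
min_points parameter, and the sample data are hypothetical.

```python
from statistics import mean, median


def summarize(dists_by_cluster, min_points=2):
    """Summarize point-to-centroid distances per cluster, skipping clusters
    with fewer than min_points points instead of failing on them."""
    summaries, skipped = {}, []
    for cluster_id, d in dists_by_cluster.items():
        if len(d) < min_points:
            skipped.append(cluster_id)  # too few points for a summary
            continue
        summaries[cluster_id] = {
            "count": len(d),
            "mean": mean(d),
            "median": median(d),
            "max": max(d),
        }
    return summaries, skipped


# Cluster 1 holds a single point, so it is reported as skipped.
summaries, skipped = summarize({0: [1.0, 3.0], 1: [2.0]})
print(summaries)
print("skipped clusters:", skipped)
```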


 





On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
 
Hi,

I want to use ClusteringUtils on KMeans clusteredPoints to get
summarizeClusterDistances, daviesBouldinIndex and dunnIndex.

Is there any sample or example showing how to use these features?
-- 
Thanks & Regards
Bikash Kumar Gupta
