Hi,

After Thomas's implementation of K-Means (3), I was motivated to extend it using Canopy clustering. So I started looking at the MR implementation of Canopy, (1) and (2). The MR implementation of Canopy clustering is done in two MR phases: the first identifies the canopies, and the second assigns canopies to the data points. I don't see much improvement when this is done using BSP. Please correct me if I am wrong.
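
To make the two phases concrete, here is a rough single-machine sketch in Python. It is not the Mahout or Hama code: the function names, the thresholds t1 > t2, the Euclidean distance, and the nearest-center assignment in the second phase are all assumptions chosen for illustration.

```python
import math


def distance(a, b):
    # Euclidean distance; the distance measure is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_canopies(points, t1, t2):
    """Phase 1: identify canopies using a loose threshold t1 and a tight threshold t2."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []  # list of (center, members) pairs
    while remaining:
        # Take the next unprocessed point as a new canopy center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = distance(center, p)
            if d < t1:
                members.append(p)  # within the loose threshold: joins this canopy
            if d >= t2:
                still_remaining.append(p)  # outside the tight threshold: may seed a later canopy
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


def assign_to_canopies(points, centers):
    """Phase 2: label each point with the index of its nearest canopy center (assumed rule)."""
    return [
        min(range(len(centers)), key=lambda i: distance(p, centers[i]))
        for p in points
    ]


points = [(0.0, 0.0), (0.5, 0.0), (1.5, 0.0), (10.0, 10.0), (10.5, 10.0)]
canopies = find_canopies(points, t1=2.0, t2=1.0)
centers = [center for center, _ in canopies]
labels = assign_to_canopies(points, centers)
```

In the MR version, each of these two functions roughly corresponds to one MR job, which is why it takes two passes over the data.
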

Also, are there any algorithms that can be implemented easily on Hama/BSP (for those who are getting started with Hama/BSP, like me) and that would also show some performance improvement compared to the MR implementation? I have looked at Mahout, which has many algorithms implemented, and would like to see something similar in Hama as well.

Thanks,
Praveen

(1) - http://horicky.blogspot.in/2011/04/k-means-clustering-in-map-reduce.html
(2) - https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering
(3) - http://codingwiththomas.blogspot.in/2011/12/k-means-clustering-with-bsp-intuition.html
