Mahout parallel K-Means - algorithms analysis
To whom it may concern,

Hello, I have been studying the MapReduce k-means implementation in Mahout 0.9, and I would like to know where I can find the code that runs inside the map function and the reducer. I was debugging with NetBeans and could not find what exactly is implemented in the Map and Reduce functions.

I am doing this because I would like to know exactly what Mahout 0.9 implements, in order to see which parts of the K-Means MapReduce algorithm were optimized.

Do you know which research paper the Mahout K-means was based on, or where I can read the pseudocode?

Thank you so much! Best regards! Hiroshi
Re: Mahout parallel K-Means - algorithms analysis
We would love to help. Can you say which program and which classes you are looking at?

On Sat, Mar 15, 2014 at 12:58 PM, hiroshi leon hiroshi_8...@hotmail.com wrote:
> ... I would like to know where can I check the code of what is happening inside the map function and in the reducer? ...
Re: Mahout parallel K-Means - algorithms analysis
The clustering code is in CIMapper and CIReducer. Following the clustering, there is a cluster classification step, which is mapper-only. I am not sure about the reference paper, since this code has been around for a long time, but the k-means documentation on mahout.apache.org should explain the approach.

On Mar 15, 2014, at 5:36 PM, hiroshi leon hiroshi_8...@hotmail.com wrote:
> ... I would like to know where I can check the code that is implemented for the mapper and the reducer. Is it in CIMapper.class and CIReducer.class? ...
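For readers following the mapper/reducer split mentioned in this reply, here is a minimal sketch of how one k-means iteration is commonly divided into a map phase and a reduce phase. It is plain Java with no Hadoop or Mahout dependencies, and it is not taken from the Mahout source: the class KMeansSketch, the Partial helper, and the method names are all hypothetical, and the actual CIMapper and CIReducer may organize the work differently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, no Hadoop or Mahout classes involved.
public class KMeansSketch {

    /** Running sum and count of the points assigned to one cluster. */
    public static final class Partial {
        final double[] sum;
        long count;

        Partial(int dim) { sum = new double[dim]; }

        void add(double[] point) {
            for (int i = 0; i < sum.length; i++) sum[i] += point[i];
            count++;
        }

        void merge(Partial other) {
            for (int i = 0; i < sum.length; i++) sum[i] += other.sum[i];
            count += other.count;
        }
    }

    /** Index of the center closest to the point (squared Euclidean distance). */
    public static int nearest(double[] point, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = 0;
            for (int i = 0; i < point.length; i++) {
                double diff = point[i] - centers[c][i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }

    /** Map side: assign each point of one input split to its nearest center
     *  and emit one partial (sum, count) per cluster id. */
    public static Map<Integer, Partial> map(double[][] split, double[][] centers) {
        Map<Integer, Partial> out = new HashMap<>();
        for (double[] point : split) {
            out.computeIfAbsent(nearest(point, centers), k -> new Partial(point.length)).add(point);
        }
        return out;
    }

    /** Reduce side: merge the partials that share a cluster id and average
     *  them into new centers. A cluster with no points keeps its old center. */
    public static double[][] reduce(List<Map<Integer, Partial>> mapOutputs, double[][] centers) {
        Map<Integer, Partial> merged = new HashMap<>();
        for (Map<Integer, Partial> output : mapOutputs) {
            for (Map.Entry<Integer, Partial> e : output.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new Partial(e.getValue().sum.length))
                      .merge(e.getValue());
            }
        }
        double[][] next = new double[centers.length][];
        for (int c = 0; c < centers.length; c++) {
            Partial p = merged.get(c);
            if (p == null) {
                next[c] = centers[c].clone();
                continue;
            }
            next[c] = new double[p.sum.length];
            for (int i = 0; i < p.sum.length; i++) next[c][i] = p.sum[i] / p.count;
        }
        return next;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 10}};
        double[][] splitA = {{1, 1}, {9, 9}};   // handled by one "mapper"
        double[][] splitB = {{1, 3}, {11, 11}}; // handled by another "mapper"
        double[][] next = reduce(Arrays.asList(map(splitA, centers), map(splitB, centers)), centers);
        System.out.println(Arrays.toString(next[0]) + " " + Arrays.toString(next[1]));
    }
}
```

A driver would repeat this map and reduce pair, feeding the new centers back in, until the centers stop moving by more than some threshold or an iteration limit is reached. Emitting per-cluster sums and counts from the map side, rather than every point, keeps the data sent to the reducer small.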
RE: Mahout parallel K-Means - algorithms analysis
Hello Ted,

Thank you so much for your reply. The program I was checking is the KMeansDriver class: its run function, the buildClusters function in the same class, and then the ClusterIterator class with its iterateMR function.

Where can I find the code that is implemented for the mapper and the reducer? Is it in CIMapper.class and CIReducer.class?

Is there a research paper or pseudocode on which Mahout's parallel K-means was based?

Thank you so much and have a nice day. Best regards

From: ted.dunn...@gmail.com
Date: Sat, 15 Mar 2014 13:56:56 -0700
Subject: Re: Mahout parallel K-Means - algorithms analysis
To: user@mahout.apache.org

We would love to help. Can you say which program and which classes you are looking at?