Hi Joe, great thank you very much for clarification. I love classification algorithms, so I'll be very interested in how you develop this. "Per se" you can translate every MapReduce algorithm to BSP, since BSP is an abstraction to MapReduce. E.G: Map Phase is a local computation phase, merging and sorting are the synchronization barrier (needs finish of all map tasks) and reducing is a computational phase again. On the english wikipedia is a good schema that shows how the workflow is.
Actually you can make your map phase as well in BSP, but for the latest release 0.3.0 you have to write data sharding and partitioning for yourself. There are examples and blog posts that shows how code them. Your reduce step is depending on your implementation. Is there a single reducer which updates the whole classifier? Actually I wanted to implement a k-means clustering in BSP, but sadly I was very busy and have not too much time for it. It is quite similar to your algorithm. The Map step is calculating the distance between the current point and the centers and the reducer is going to update the centers. To provide you with a bit of information, I already rewritten an MapReduce graph algorithm to BSP. [1][2] These examples are without partitioning, I recently did an improvement to the partitioning algorithm. So it makes sense to checkout the current trunk and browse through the graph package and examples package. It contains improved partitioning as well as graph examples. HF and GL. If you need help, I'll be glad to help you. [1] http://codingwiththomas.blogspot.com/2011/04/graph-exploration-with-hadoop-mapreduce.html [2] http://codingwiththomas.blogspot.com/2011/04/graph-exploration-using-apache-hama-and.html 2011/9/2 Zhiyong Xie <[email protected]> > Thank you Thomas! In short, SVM ( > http://en.wikipedia.org/wiki/Support_vector_machine) is a supervised > learning classifier described as optimization problem and solved by > gradient > descent approach (http://en.wikipedia.org/wiki/Stochastic_gradient_descent > ). > It is a iterative process, and kinda run a map/reduce pair per iteration. > Map to calculate the gradient value for each point, and reduce phase to > optimize the classifier. BSP model seems native for scientific and graph > processing in my mind, not figure out or find much info online for this > type > of application so far . > > Best, > Joe > > On Thu, Sep 1, 2011 at 10:36 AM, Thomas Jungblut < > [email protected]> wrote: > > > Hi Joe, > > > > for non-insiders, would you please clarify what SGD and SVM are? > > Then we could give you some tips how to implement them in BSP. > > > > Greetz, > > Thomas > > > > 2011/9/1 Zhiyong Xie <[email protected]> > > > > > Hi there, > > > > > > May I ask whether anyone else have look into the SGD mapping on BSP > model > > > too? I'm investigating whether BSP model is a good candidate for > > > implementing distributed version of SVM SGD. > > > > > > Thanks! > > > Joe > > > -- > > > Joe (Zhiyong) Xie > > > Graduate Student > > > > > > > > > > > -- > > Thomas Jungblut > > Berlin > > > > mobile: 0170-3081070 > > > > business: [email protected] > > private: [email protected] > > > > > > -- > Joe (Zhiyong) Xie > Graduate Student > -- Thomas Jungblut Berlin mobile: 0170-3081070 business: [email protected] private: [email protected]
