Hi,

A few days back, I started reading about Apache Spark. It is a pretty good big data platform. But one question comes to mind: where does Hama stand in comparison with Spark if we have to implement an iterative, compute-intensive algorithm (machine learning or optimization)?
I found some resources online, but none of them answers my questions.

1) BSP vs MapReduce paper <http://arxiv.org/pdf/1203.2081v2.pdf>
2) https://people.apache.org/~edwardyoon/documents/Hama_BSP_for_Advanced_Analytics.pdf
3) I also found the following benchmark, but it is quite old: http://markmail.org/message/vyjsdpv355kua7rm#query:+page:1+mid:vstgda4fhmz52pdw+state:results

Questions:

1) Is there any specific advantage to choosing the BSP model over the Spark paradigm?
2) Are there any recent benchmarks comparing the two systems?
3) What is the most convincing reason to use Hama over Spark?
4) Is there any scientific paper that compares both systems? (I was not able to find any.)

Regards,
Behroz Sikander
