Just FYI, after reading this thread, one of my friends said, "if Amazon
EC2 = MR or BSP, Google App Engine = Spark". Maybe that refers to the
usability side.

On Thu, Jun 25, 2015 at 8:46 AM, Edward J. Yoon <[email protected]> wrote:
> Hi, here are a few thoughts.
>
> Apache Spark is definitely better suited to ML (iterative algorithms) than
> legacy Hadoop, thanks to its preservation of state and optimized execution
> strategy (RDDs). However, its approach still follows a synchronous iterative
> communication pattern.
>
> Apache Hama, in contrast, is a general-purpose pure BSP framework. While I
> admit that synchronization costs are high, communication can be realized more
> efficiently with the message-passing BSP model. Moreover, BSP can offer
> virtual shared memory and many other benefits. Another convincing point, I
> think, is the ability to make use of modern acceleration hardware such as
> InfiniBand and GPUs. I'm sure that this feature will bring a completely new
> wave of big data. The only problem we face is a lack of interest in the BSP
> programming model. :-)
>
>> 2) Do we have any recent benchmarks between the 2 systems ?
>
> It's on my to-do list.
>
> --
> Best Regards, Edward J. Yoon
>
> -----Original Message-----
> From: Behroz Sikander [mailto:[email protected]]
> Sent: Thursday, June 25, 2015 12:57 AM
> To: [email protected]
> Subject: Hama vs Spark
>
> Hi,
> A few days back, I started reading about Apache Spark. It is a pretty good
> big data platform. But a question comes to mind: where does Hama stand in
> comparison with Spark if we have to implement an iterative, compute-intensive
> algorithm (machine learning or optimization)?
>
> I found some resources online, but none of them answers my questions.
>
> 1) BSP vs MapReduce paper <http://arxiv.org/pdf/1203.2081v2.pdf>
> 2) https://people.apache.org/~edwardyoon/documents/Hama_BSP_for_Advanced_Analytics.pdf
> 3) I actually found the following benchmark, but it is quite old:
>    http://markmail.org/message/vyjsdpv355kua7rm#query:+page:1+mid:vstgda4fhmz52pdw+state:results
>
> Questions:
> 1) Is there any specific advantage to choosing the BSP model over the Spark
> paradigm?
> 2) Do we have any recent benchmarks between the 2 systems ?
> 3) What is the main convincing point for using Hama over Spark?
> 4) Is there any scientific paper that compares both systems? (I was not able
> to find any.)
>
> Regards,
> Behroz Sikander
>
>



-- 
Best Regards, Edward J. Yoon
