Re: compared with MapReduce ,what is the advantage of HAMA?

Thomas Jungblut Sat, 24 Sep 2011 02:45:01 -0700

Thanks for your tips. I'll forward this to our dev list for discussion.

2011/9/24 changguanghui <[email protected]>


> I think it is important to find some algorithms or problems that are better
> suited to HAMA. Then people can compare the results of HAMA and MapReduce
> directly, because many people want to know why and when they should choose
> HAMA.
>
>  -----Original Message-----
> From: Thomas Jungblut [mailto:[email protected]]
> Sent: 23 September 2011 19:39
> To: [email protected]
> Subject: Re: compared with MapReduce ,what is the advantage of HAMA?
>
> Hi,
> to state the advantage clearly: you have less overhead.
> Let me illustrate this with an algorithm for mindist search, which I
> renamed to graph exploration. The same applies to shortest paths, too.
> I wrote about it here:
>
> http://codingwiththomas.blogspot.com/2011/04/graph-exploration-with-hadoop-mapreduce.html
>
> Basically the algorithm groups the components of the graph and assigns the
> lowest key of the group as an identifier for the component.
> Graph problems are usually solved in MapReduce with a technique called
> "message passing".
> In every map step you send messages to other vertices. Then you have to
> shuffle, sort and reduce the vertices to compute the result.
> This cannot be done in a single iteration, so you have to chain several
> map/reduce jobs.
>
> Each iteration incurs the overhead of sorting and shuffling.
> In addition, all of this goes through the disk.
>
> Hama provides a message passing interface, so you don't have to take care
> of writing each message to HDFS.
> Each iteration, which in MapReduce is a full job execution, is called a
> superstep in BSP.
> Each superstep is faster than a full job execution in Hadoop, because you
> don't have the overhead of spilling to disk, job setup, sorting and
> shuffling.
> In addition, you can keep your whole graph in RAM, which speeds up the
> computation further. Hadoop does not offer this capability yet.
>
> But I also want to point out some facts that are less positive:
> Currently no benchmarks against Hadoop or other frameworks like Giraph or
> GoldenORB exist, so we can't say: we are the best/fastest/coolest.
> And graph algorithms are hard to code. As you can see, I have written
> lots of code to get this running. That is because I have to take care of
> the partitioning, vertex messaging and IO stuff myself.
> For that purpose we are going to release a Pregel API which makes the
> development of graph algorithms a lot easier.
> You can get a sneak peek here:
> https://issues.apache.org/jira/browse/HAMA-409
>
> That was a lot of text, but I hope it clarifies a lot.
>
> Best Regards,
> Thomas
>
> 2011/9/23 changguanghui <[email protected]>
>
> > Hi Thomas,
> >
> > Could you provide a concrete example to illustrate the advantage of HAMA
> > compared with MapReduce?
> >
> > For example, SSSP on HAMA vs. SSSP on MapReduce, so that I can grasp the
> > idea of HAMA quickly.
> >
> > Thank you very much!
> >
> > Changguanghui
> >
>
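
To make the superstep idea from the quoted mail a bit more concrete, here is a
minimal single-process sketch in Python of the component labelling described
there: every vertex ends up with the lowest vertex id of its group. It does not
use the Hama API. The function name label_components and the inbox/outbox
dictionaries are made up for illustration, and the "barrier" is simply the end
of each loop pass. The graph stays in memory the whole time and only messages
move between supersteps, which is the part that MapReduce would do with a
shuffle and sort through the disk per job.

```python
from collections import defaultdict


def label_components(edges):
    """Label every vertex with the lowest vertex id in its component.

    Hypothetical helper for illustration only, not part of Hama.
    Returns (labels, number_of_supersteps).
    """
    # Undirected adjacency list, kept in memory across all supersteps.
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Each vertex starts with its own id as its component label.
    label = {v: v for v in neighbours}

    # First superstep: every vertex sends its label to all neighbours.
    inbox = defaultdict(list)
    for v in neighbours:
        for n in neighbours[v]:
            inbox[n].append(label[v])
    supersteps = 1

    # Later supersteps: a vertex that learns a lower label adopts it and
    # tells its neighbours. The run ends when no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            lowest = min(messages)
            if lowest < label[v]:
                label[v] = lowest
                for n in neighbours[v]:
                    outbox[n].append(lowest)
        inbox = outbox  # stands in for the barrier synchronisation
        supersteps += 1

    return label, supersteps


if __name__ == "__main__":
    labels, steps = label_components([(1, 2), (2, 3), (5, 6)])
    print(labels, "after", steps, "supersteps")
```

In a MapReduce version each pass of the while loop would be a separate chained
job, with the vertex state written out and read back in between.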



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: [email protected]
private: [email protected]
