[jira] [Commented] (HAMA-990) GSoC'16: Apache Hama benchmark against Spark and Flink

Edward J. Yoon (JIRA) Thu, 19 May 2016 18:24:12 -0700

    [ 
https://issues.apache.org/jira/browse/HAMA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292508#comment-15292508
 ]


Edward J. Yoon commented on HAMA-990:
-------------------------------------

{quote}
According to [1] and [3], Apache Flink is faster than Spark in K-Means, Page 
Rank and Query Processing whereas Spark is faster in Word Count. We can 
reproduce these results in our cluster and then can calculate the results for 
Hama. Once we have all the results we can compare all the systems.
{quote}

I think this is a good idea. With this, we may be able to derive insight from 
the results (this should be our goal). I believe I heard that Flink uses its 
own serialization techniques and shows good performance but is unstable. Just 
FYI, MRQL can also be used for K-Means and PageRank.

Regarding the cluster, my current cluster (used for my research) consists of 
only a few high-end machines equipped with GPUs, so it is not well suited to a 
large-scale distributed computing benchmark. If you can write some scripts that 
automatically produce benchmark results on clouds such as Amazon or Google 
Cloud, I can help.


> GSoC'16: Apache Hama benchmark against Spark and Flink
> ------------------------------------------------------
>
>                 Key: HAMA-990
>                 URL: https://issues.apache.org/jira/browse/HAMA-990
>             Project: Hama
>          Issue Type: Documentation
>            Reporter: Behroz Sikander
>            Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
