[
https://issues.apache.org/jira/browse/HAMA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292508#comment-15292508
]
Edward J. Yoon commented on HAMA-990:
-------------------------------------
{quote}
According to [1] and [3], Apache Flink is faster than Spark in K-Means, Page
Rank and Query Processing whereas Spark is faster in Word Count. We can
reproduce these results in our cluster and then can calculate the results for
Hama. Once we have all the results we can compare all the systems.
{quote}
I think this is a good idea. With this, we may be able to derive insights from
the results (this should be our goal). I have heard that Flink uses its own
serialization techniques and shows good performance but is unstable. Just FYI,
MRQL can also be used for K-Means and PageRank.
Regarding the cluster, my current cluster (used for my research) consists of
only a few high-end machines equipped with GPUs, so it is not well suited to a
large-scale distributed computing benchmark. If you can write some scripts that
automatically produce benchmark results on clouds such as Amazon or Google
Cloud, I can help.
> GSoC'16: Apache Hama benchmark against Spark and Flink
> ------------------------------------------------------
>
> Key: HAMA-990
> URL: https://issues.apache.org/jira/browse/HAMA-990
> Project: Hama
> Issue Type: Documentation
> Reporter: Behroz Sikander
> Priority: Minor
>