[jira] [Commented] (HAMA-983) Hama runner for DataFlow

JongYoon Lim (JIRA) Wed, 07 Dec 2016 17:33:40 -0800

    [ https://issues.apache.org/jira/browse/HAMA-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730739#comment-15730739 ]


JongYoon Lim commented on HAMA-983:
-----------------------------------

I added a link to the skeleton code for hama-runner.
I also added a TranslationContext class for executing the batch job. The
results (supersteps) produced by the translator are added to a list in
TranslationContext, and once all translations are done, it executes the
supersteps one by one. However, each result (superstep) I add is an object, not
a class. Since those objects are created on the master, I am wondering whether
there is an easy way to create the same objects on the grooms. I am also
wondering whether this approach is correct.
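
To make the question concrete, here is a minimal sketch of one possible way to
get the same objects onto the grooms: Java serialization. It is only an
illustration under stated assumptions, not the hama-runner code. The Step
interface, PrefixStep, serializeSteps, deserializeSteps and runAll are all
hypothetical names, Step is a stand-in rather than Hama's own superstep type,
and how the bytes would actually travel from master to groom (for example
through the job configuration) is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TranslationContextSketch {

  /** Hypothetical stand-in for one translated step; NOT Hama's superstep API. */
  interface Step extends Serializable {
    List<String> apply(List<String> input);
  }

  /** Example step whose state (the prefix) exists only as an object on the master. */
  static class PrefixStep implements Step {
    private static final long serialVersionUID = 1L;
    private final String prefix;

    PrefixStep(String prefix) {
      this.prefix = prefix;
    }

    @Override
    public List<String> apply(List<String> input) {
      List<String> out = new ArrayList<>();
      for (String s : input) {
        out.add(prefix + s);
      }
      return out;
    }
  }

  /** Collects the step objects produced during translation (assumed shape). */
  static class TranslationContext {
    private final List<Step> steps = new ArrayList<>();

    void addStep(Step step) {
      steps.add(step);
    }

    /** Master side: turn the collected step objects into bytes that can be shipped. */
    byte[] serializeSteps() throws IOException {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(new ArrayList<>(steps));
      }
      return bytes.toByteArray();
    }
  }

  /** Groom side: rebuild equivalent step objects from the shipped bytes. */
  @SuppressWarnings("unchecked")
  static List<Step> deserializeSteps(byte[] data)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (List<Step>) in.readObject();
    }
  }

  /** Executes the steps one by one, feeding each output into the next step. */
  static List<String> runAll(List<Step> steps, List<String> input) {
    List<String> current = input;
    for (Step step : steps) {
      current = step.apply(current);
    }
    return current;
  }

  public static void main(String[] args) throws Exception {
    TranslationContext ctx = new TranslationContext();
    ctx.addStep(new PrefixStep("a-"));
    ctx.addStep(new PrefixStep("b-"));

    byte[] shipped = ctx.serializeSteps();          // done on the master
    List<Step> rebuilt = deserializeSteps(shipped); // done on a groom

    System.out.println(runAll(rebuilt, Arrays.asList("x"))); // prints [b-a-x]
  }
}
```

The trade-off in this sketch is that every step, and everything it references,
has to be Serializable, and the step classes must be on the classpath of the
grooms. An alternative would be to ship only a class name plus parameters and
instantiate the step on the groom.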

> Hama runner for DataFlow
> ------------------------
>
>                 Key: HAMA-983
>                 URL: https://issues.apache.org/jira/browse/HAMA-983
>             Project: Hama
>          Issue Type: Bug
>            Reporter: Edward J. Yoon
>              Labels: gsoc2016
>
> As you already know, Apache Beam provides a unified programming model for both 
> batch and streaming inputs.
> The APIs are generally associated with data filtering and transformation, so 
> we'll need to implement a data processing runner like 
> https://github.com/dapurv5/MapReduce-BSP-Adapter/blob/master/src/main/java/org/apache/hama/mapreduce/examples/WordCount.java
> Implementing a similarity join could also be fun. According to 
> http://www.ruizhang.info/publications/TPDS2015-Heads_Join.pdf, Apache Hama is 
> the clear winner over Apache Hadoop and Apache Spark.
> Since a similarity join consists of transformation, aggregation, and partition 
> computations, I think it's possible to implement it using the Apache Beam APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
