[ 
https://issues.apache.org/jira/browse/SAMZA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045587#comment-14045587
 ] 

Yan Fang commented on SAMZA-307:
--------------------------------

Oops, sorry for generating so many change notifications... (I did not realize 
JIRA is so sensitive...)

Yes, I feel that providing a deploy script to automate the process will 
definitely help jobs run more smoothly.
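
A rough sketch of what such a script could do, following steps 2 and 3 of the 
description quoted below. This is only an illustration, not an existing Samza 
tool: the function names and all paths are hypothetical, it assumes the `hdfs` 
command is on the PATH and already picks up the cluster configuration (step 1, 
e.g. via HADOOP_CONF_DIR), and it assumes the job's properties file already 
points yarn.package.path at the uploaded package.

```python
import posixpath
import subprocess
from typing import List


def build_deploy_commands(artifacts: List[str], hdfs_dir: str,
                          run_job: str, config_path: str) -> List[List[str]]:
    """Build the commands for steps 2 and 3 without running them."""
    # Step 2: make sure the target directory exists, then upload each
    # artifact, overwriting any older copy (-f).
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    for local_path in artifacts:
        target = posixpath.join(hdfs_dir, posixpath.basename(local_path))
        commands.append(["hdfs", "dfs", "-put", "-f", local_path, target])
    # Step 3: run the job as usual through Samza's run-job.sh.
    # config_path is assumed to be an absolute local path.
    commands.append([
        run_job,
        "--config-factory="
        "org.apache.samza.config.factories.PropertiesConfigFactory",
        "--config-path=file://" + config_path,
    ])
    return commands


def deploy(artifacts: List[str], hdfs_dir: str,
           run_job: str, config_path: str) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in build_deploy_commands(artifacts, hdfs_dir,
                                         run_job, config_path):
        subprocess.run(command, check=True)
```

With something like this, a user would call deploy() once with the job 
package, an HDFS directory, the path to run-job.sh and the job's properties 
file, instead of uploading and launching by hand after every change.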

> Simplify YARN deploy procedure 
> -------------------------------
>
>                 Key: SAMZA-307
>                 URL: https://issues.apache.org/jira/browse/SAMZA-307
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Yan Fang
>
> Currently, we have two ways of deploying a Samza job to a YARN cluster, from 
> [HDFS|https://samza.incubator.apache.org/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html]
>  and [HTTP|https://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html],
>  but neither of them works out of the box. Users have to go through the 
> tutorial, add dependencies, recompile, put the job package on HDFS or an HTTP 
> server, and then finally run the job. I find this a little cumbersome. We may 
> be able to provide a simpler way to deploy the job.
> When users have YARN and HDFS in the same cluster (such as CDH5), we can 
> provide a job-submit script which does:
> 1. take cluster configuration
> 2. call some Java code to upload the assembly (all the jars Samza needs, 
> already compiled) and the user's job jar (which changes frequently) to HDFS
> 3. run the job as usual. 
> Therefore, users only need to run a single command *instead of*:
> 1. going step by step from the tutorial during their first job
> 2. assembling all code and uploading to HDFS manually every time they make 
> changes to their job. 
> (Yes, I learnt it from [Spark's Yarn 
> deploy|http://spark.apache.org/docs/latest/running-on-yarn.html] and [their 
> code|https://github.com/apache/spark/blob/master/yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala]
>  ) 
> When users only have YARN, I think they have no choice but to start the HTTP 
> server as described in the tutorial. 
> What do you think? Does the simplification make sense? Or would Samza run 
> into difficulties (issues) if we deploy this way? Thank you.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)
