[
https://issues.apache.org/jira/browse/SAMZA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045587#comment-14045587
]
Yan Fang commented on SAMZA-307:
--------------------------------
Oops, sorry for generating so many change notifications... (I did not realize
JIRA was so sensitive...)
Yes, I feel that providing a deploy script to automate the process would
definitely help jobs run more smoothly.
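As a rough illustration of the three steps proposed in the issue description (take the cluster configuration, upload the assembly and the job jar to HDFS, run the job as usual), here is a minimal dry-run sketch in Python. It only builds the command lines and executes nothing. The function name, file names, and HDFS directory are hypothetical. The `run-job.sh` location and its `--config-factory`/`--config-path` flags are assumed to match the Samza tutorials. Samza's `yarn.package.path` points at a single package, so how a separately uploaded job jar would be picked up is left open here.

```python
import posixpath
import shlex


def build_deploy_commands(assembly_tgz, job_jar, hdfs_dir, config_path,
                          run_job="deploy/samza/bin/run-job.sh"):
    """Return the upload and submit commands, in order, as argument lists.

    Nothing is executed. All names and paths are illustrative assumptions.
    """
    hdfs_dir = hdfs_dir.rstrip("/")
    # Make sure the target directory exists on HDFS.
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    # Upload the rarely changing assembly and the frequently changing job jar.
    for local in (assembly_tgz, job_jar):
        target = hdfs_dir + "/" + posixpath.basename(local)
        commands.append(["hdfs", "dfs", "-put", "-f", local, target])
    # Submit the job as usual; the properties file is expected to reference
    # the uploaded package (for example through yarn.package.path).
    commands.append([
        run_job,
        "--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory",
        "--config-path=file://" + config_path,
    ])
    return commands


if __name__ == "__main__":
    for cmd in build_deploy_commands(
            "target/samza-assembly.tar.gz",        # hypothetical assembly
            "target/my-job.jar",                   # hypothetical job jar
            "/user/samza/my-job",                  # hypothetical HDFS dir
            "/opt/my-job/config/my-job.properties"):
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try without a cluster; a real script would pass each list to `subprocess.run(..., check=True)`.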
> Simplify YARN deploy procedure
> -------------------------------
>
> Key: SAMZA-307
> URL: https://issues.apache.org/jira/browse/SAMZA-307
> Project: Samza
> Issue Type: Improvement
> Reporter: Yan Fang
>
> Currently, we have two ways of deploying a Samza job to a YARN cluster, from
> [HDFS|https://samza.incubator.apache.org/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html]
> and
> [HTTP|https://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html],
> but neither of them works out of the box. Users have to go through the
> tutorial, add dependencies, recompile, put the job package on HDFS or an HTTP
> server, and then finally run the job. I feel this is a little cumbersome. We
> may be able to provide a simpler way to deploy the job.
> When users have YARN and HDFS in the same cluster (such as CDH5), we can
> provide a job-submit script which does the following:
> 1. takes the cluster configuration
> 2. calls some Java code to upload the assembly (all the jars Samza needs,
> already compiled) and the user's job jar (which changes frequently) to HDFS
> 3. runs the job as usual.
> Therefore, users only need to run one command *instead of*:
> 1. going through the tutorial step by step for their first job
> 2. assembling all the code and uploading it to HDFS manually every time they
> change their job.
> (Yes, I learnt this from [Spark's YARN
> deploy|http://spark.apache.org/docs/latest/running-on-yarn.html] and [their
> code|https://github.com/apache/spark/blob/master/yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala].)
> When users only have YARN, I think they have no choice but to start the HTTP
> server as described in the tutorial.
> What do you think? Does the simplification make sense? Or would Samza run
> into difficulties (issues) if we deploy this way? Thank you.
>