[
https://issues.apache.org/jira/browse/SAMZA-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288450#comment-14288450
]
Chris Riccomini commented on SAMZA-516:
---------------------------------------
bq. Is the run-job.sh executed for each job on each node?
In the idea I posted above, yes.
bq. Is it worth thinking of running a daemon process that registers the
available daemon processes to process_group on ZK, and then each daemon process
watches a jobs ZK directory for new job node?
I think there's a trade-off here. This approach would support multiple jobs,
which would be nice for things like a query language (SAMZA-390). But it also
adds complexity for end users (the search space for their logs is now a lot
larger: all the machines in the grid, not just the N machines that they ran
run-job.sh on) and for the operator (we would now need a way to shift binaries
around to the appropriate machines).
If you squint a bit, this proposal looks pretty similar to YARN/Mesos, but
without any sophisticated scheduling. You just get full machines, and you
either have machines available, or you don't. My initial inclination is that,
if you want this, you should just use YARN or Mesos. I will think it through,
and add it to the document, though. I currently have 3 potential solutions:
# Embedded YARN
# ZK-based approach (as described in my original post)
# ZK-daemon approach (as you described)
I will try and think through them, and post the design doc.
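
To make the ZK-daemon option a bit more concrete, here is a rough in-memory
sketch of the "full machines, available or not" behavior described above. It
does not talk to ZooKeeper at all. The class and method names (Grid,
register_daemon, submit_job, machine_lost) are hypothetical and only for
illustration, and the failover policy (take the first idle machine) is an
assumption, not something from the design doc.

```python
class Grid:
    """In-memory stand-in for the state the daemons would keep in ZK.

    Hypothetical sketch: no real ZooKeeper client, no scheduling beyond
    handing out whole machines in registration order.
    """

    def __init__(self):
        self.idle = []       # machines whose daemon has registered and is free
        self.assigned = {}   # job name -> list of machines running it

    def _in_use(self, machine):
        return any(machine in machines for machines in self.assigned.values())

    def register_daemon(self, machine):
        # A daemon announces its machine as available (one daemon per machine).
        if machine not in self.idle and not self._in_use(machine):
            self.idle.append(machine)

    def submit_job(self, job, containers):
        # Whole machines only: either enough are idle, or the job is rejected.
        if job in self.assigned:
            raise ValueError("job already running: " + job)
        if containers > len(self.idle):
            return None
        machines = self.idle[:containers]
        self.idle = self.idle[containers:]
        self.assigned[job] = machines
        return machines

    def machine_lost(self, machine):
        # Assumed failover policy: swap in the first idle machine, if any.
        if machine in self.idle:
            self.idle.remove(machine)
            return None
        for machines in self.assigned.values():
            if machine in machines:
                machines.remove(machine)
                if self.idle:
                    replacement = self.idle.pop(0)
                    machines.append(replacement)
                    return replacement
                return None
        return None


if __name__ == "__main__":
    grid = Grid()
    for host in ["host1", "host2", "host3"]:
        grid.register_daemon(host)
    print(grid.submit_job("job-a", 2))   # machines given to job-a
    print(grid.submit_job("job-b", 2))   # not enough idle machines left
    print(grid.machine_lost("host1"))    # replacement machine, if one is idle
```

Even this toy version has to track idle machines, placement, and replacement,
which is the part that starts to overlap with what YARN and Mesos already do.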
> Support standalone Samza jobs
> -----------------------------
>
> Key: SAMZA-516
> URL: https://issues.apache.org/jira/browse/SAMZA-516
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.9.0
> Reporter: Chris Riccomini
> Assignee: Chris Riccomini
>
> Samza currently supports two modes of operation out of the box: local and
> YARN. With local mode, a single Java process starts the JobCoordinator,
> creates a single container, and executes it locally. All partitions are
> processed within this container. With YARN, a YARN grid is required to
> execute the Samza job. In addition, SAMZA-375 introduces a patch to run Samza
> in Mesos.
> There have been several requests lately to be able to run Samza jobs without
> any resource manager (YARN, Mesos, etc), but still run it in a distributed
> fashion.
> The goal of this ticket is to design and implement a samza-standalone module,
> which will:
> # Support executing a single Samza job in one or more containers.
> # Support failover, in cases where a machine is lost.