[ https://issues.apache.org/jira/browse/SAMZA-42?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266414#comment-14266414 ]
Chris Riccomini commented on SAMZA-42:
--------------------------------------
Linking to SAMZA-348, since the JobCoordinator is likely where we'd want the
job setup phase to live.
> Add a job setup phase to Samza
> ------------------------------
>
> Key: SAMZA-42
> URL: https://issues.apache.org/jira/browse/SAMZA-42
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
>
> We have several use cases for doing things once at the beginning of a Samza
> job's execution (before containers start). Examples:
> * Validate or create checkpoint topic (if using KafkaCheckpointManager)
> * Validate or create state topic (if using LoggedStore)
> Right now, we have to do this in the container, which means that there's a
> race condition when running on YARN, as each container will try to create the
> same topic.
> Initially, I thought this logic could be put in the YARN AM, but then we'd
> have to put corresponding logic in the LocalJobFactory. This gets problematic
> if we implement SAMZA-41, since there would no longer be a central place to
> do a "before job starts" operation with the LocalJobFactory. If we don't do
> SAMZA-41, then we should be fine putting this logic in the YARN AM and
> LocalJobFactory.
> Alternatively, we could put this logic in JobRunner. One downside to this is
> that it would mean the JobRunner would need full access to the grid that it
> was trying to execute on (not just the RM) so that it could talk to
> Kafka/ZooKeeper (for example). I think this is actually fine, since we always
> execute our jobs from a spot that has access to the full grid.
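
To make the race described above concrete, here is a minimal sketch of a
single setup phase that runs once before any container starts, so containers
only validate topics and never create them. Everything in it is hypothetical
and for illustration only: FakeBroker is an in-memory stand-in (not a Kafka or
ZooKeeper client), and run_job_setup, container_main, and run_job are made-up
names, not Samza APIs. Threads stand in for containers.

```python
import threading


class FakeBroker:
    """In-memory stand-in for a topic service (hypothetical, not a Kafka client)."""

    def __init__(self):
        self._topics = set()
        self._lock = threading.Lock()
        self.create_calls = 0

    def topic_exists(self, name):
        with self._lock:
            return name in self._topics

    def create_topic(self, name):
        with self._lock:
            self.create_calls += 1
            if name in self._topics:
                # This is the failure concurrent containers would hit
                # if each of them tried to create the same topic.
                raise RuntimeError(f"topic already exists: {name}")
            self._topics.add(name)


def run_job_setup(broker, topics):
    """Validate or create each topic once, before any container starts."""
    for name in topics:
        if not broker.topic_exists(name):
            broker.create_topic(name)


def container_main(broker, topics, errors):
    """A container only validates; it never creates topics."""
    for name in topics:
        if not broker.topic_exists(name):
            errors.append(f"missing topic: {name}")


def run_job(broker, topics, num_containers):
    """Run the setup phase exactly once, then start the containers."""
    run_job_setup(broker, topics)
    errors = []
    threads = [
        threading.Thread(target=container_main, args=(broker, topics, errors))
        for _ in range(num_containers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

In this sketch the creation logic has exactly one caller, so it does not
matter how many containers start. Whether that caller ends up being the
JobRunner, the YARN AM, or the JobCoordinator is the open question in this
ticket.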