[
https://issues.apache.org/jira/browse/SAMZA-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088455#comment-14088455
]
Chris Riccomini commented on SAMZA-364:
---------------------------------------
bq. This seems useful, assuming suitable timeouts are in place, in case the
user code / script for a given hook does something weird.
I hadn't considered timeouts. I see this as no different from a
StreamTask.process method that takes a long time to run. In that case, we
don't have a timeout; the user's code is just slow. I had been thinking of
hooks the same way. There may also be cases where a timeout is actually harmful,
since it might lead to bad state (the equivalent of a dropped message). I'm not
convinced that we need this.
bq. Also, is recovery (for a given failed task) the same as startup ? Or should
we treat it as a separate injection point ?
Interesting. I hadn't thought about this. I don't have any use cases coming to
mind, but I'm sure there are some. For instance, setting external stuff up the
first time a container runs seems useful (e.g. create a table, init a DB, etc).
There are multiple related points here: first job start, first container start,
subsequent container start after failure, and subsequent container start after
successful shutdown. One trick is going to be figuring out how a container will
know what kind of startup it's doing (recovery, first time, etc.).
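To make the four startup cases concrete, here is a minimal sketch of how they could be told apart. Everything in it is hypothetical: StartupKind, StartupClassifier, and the three boolean inputs do not exist in Samza, and the sketch simply assumes that something durable (a marker file, a checkpoint entry, etc.) records whether the job and the container have started before and whether the last shutdown was clean. How to record that reliably is exactly the open question.

```java
// Hypothetical sketch: none of these names are part of Samza.
enum StartupKind {
  FIRST_JOB_START,
  FIRST_CONTAINER_START,
  RESTART_AFTER_FAILURE,
  RESTART_AFTER_CLEAN_SHUTDOWN
}

final class StartupClassifier {
  private StartupClassifier() {}

  /**
   * Classifies a container startup from three assumed facts.
   *
   * @param jobStartedBefore       whether any container of this job has ever started
   * @param containerStartedBefore whether this particular container has started before
   * @param lastShutdownClean      whether the previous run of this container recorded
   *                               a clean shutdown (ignored on a first start)
   */
  static StartupKind classify(boolean jobStartedBefore,
                              boolean containerStartedBefore,
                              boolean lastShutdownClean) {
    if (!jobStartedBefore) {
      return StartupKind.FIRST_JOB_START;
    }
    if (!containerStartedBefore) {
      return StartupKind.FIRST_CONTAINER_START;
    }
    return lastShutdownClean
        ? StartupKind.RESTART_AFTER_CLEAN_SHUTDOWN
        : StartupKind.RESTART_AFTER_FAILURE;
  }
}
```

A startup hook could then switch on the returned value, for example creating a table only on FIRST_JOB_START.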
> Refactor lifecycle listener APIs
> --------------------------------
>
> Key: SAMZA-364
> URL: https://issues.apache.org/jira/browse/SAMZA-364
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.8.0
> Reporter: Chris Riccomini
>
> There are many cases where a developer might like to plug into different
> points throughout the lifecycle of a Samza job's execution. Example use cases
> for this include:
> * Wanting to set up/tear down an slf4j log binding (SAMZA-350)
> * Wanting to initialize remote systems that the job interacts with.
> * Wanting to set up a custom HTTP (or some other) server for operational
> purposes.
> Potential injection points include:
> # SamzaContainer startup/shutdown.
> # Before/after processing a message in a TaskInstance.
> # Before/after init'ing a TaskInstance.
> # Before/after shutting down a TaskInstance.
> # Before/after windowing a TaskInstance.
> # Before/after committing offsets.
> # Before/after a job is started.
> # All of the YARN AM related events.
> Right now, we have a TaskLifecycleListener. This is insufficient to cover all
> cases, and is somewhat cumbersome to deal with because we instantiate one
> listener per TaskInstance in a container.
> We should refactor and re-design the lifecycle APIs for Samza. This will
> include investigating how other servlets/containers handle things, writing a
> doc, and implementing the new API.
> I expect that this will result in backwards incompatible changes with
> TaskLifecycleListener. I think this is OK.
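As a rough illustration of the direction described in the issue, one listener per container (rather than one per TaskInstance) might look something like the sketch below. This is not a proposed design: ContainerLifecycleListener, RecordingListener, and all method names are made up for illustration, only a few of the injection points listed above are covered, task names are passed as plain strings, and default methods assume Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; not part of Samza. Every callback is a no-op by
// default, so an implementation overrides only the points it cares about.
interface ContainerLifecycleListener {
  default void onContainerStart() {}
  default void beforeProcess(String taskName) {}
  default void afterProcess(String taskName) {}
  default void beforeCommit(String taskName) {}
  default void afterCommit(String taskName) {}
  default void onContainerShutdown() {}
}

// Example listener that records the events it chooses to observe.
final class RecordingListener implements ContainerLifecycleListener {
  final List<String> events = new ArrayList<>();

  @Override
  public void onContainerStart() {
    events.add("container-start");
  }

  @Override
  public void beforeProcess(String taskName) {
    events.add("before-process:" + taskName);
  }

  @Override
  public void onContainerShutdown() {
    events.add("container-shutdown");
  }
}
```

Because a single instance sees every task in the container, per-task state would have to be keyed by the task name passed to each callback.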