[ 
https://issues.apache.org/jira/browse/STORM-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011415#comment-15011415
 ] 

ASF GitHub Bot commented on STORM-126:
--------------------------------------

Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/884#issuecomment-157780934
  
    @schonfeld the reason for doing it per component is because you have a 
framework, like spring, that wants to be setup and cleaned up, but only a 
subset of the components use it.  I supposed having the worker hook run on 
workers that don't use spring probably is not that bad, but I can also see that 
if it adds extra overhead some people may want to turn it off.
    
    I am still +1 on this change.


> Add Lifecycle support API for worker nodes
> ------------------------------------------
>
>                 Key: STORM-126
>                 URL: https://issues.apache.org/jira/browse/STORM-126
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: James Xu
>            Assignee: Michael Schonfeld
>            Priority: Minor
>             Fix For: 0.11.0
>
>
> https://github.com/nathanmarz/storm/issues/155
> Storm is already used in variety of environments. It is important that Storm 
> provides some form of "lifecycle" API specified at Topology builder level to 
> be called when worker nodes start and stop.
> It is a very crucial functional piece that is missing from Storm. Many 
> project have to integrate, for example, with various container-like 
> frameworks like Spring or Google Guice that need to be started at stopped in 
> a controlled fashion before worker nodes begin or finish their work.
> I think something like a WorkerContextListener interface with two methods: 
> onStartup(SomeContextClass) and onShutdown(SomeContextClass) could go a very 
> long way for allowing to user's to plugin various third-party libraries 
> easily.
> Then, the TopologyBuilder needs to be modified to accept classes that 
> implement this interface.
> SomeContextClass does not need to be much more than a Map for now. But it is 
> important to have it as it allows propagation ofl information between those 
> lifecycle context listeners.
> Nathan, it would interesting to hear your opinion.
> Thanks!
> ----------
> nathanmarz: I agree, this should be added to Storm. The lifecycle methods 
> should be parameterized with which tasks are running in this worker.
> Additionally, I think lifecycle methods should be added for bolt/spouts in 
> the context of workers. Sometimes there's some code you want to run for a 
> spout/bolt within a worker only one time, regardless of how many tasks for 
> that bolt are within the worker. Then individual tasks should be able to 
> access that "global state" within the worker for that spout/bolt.
> ----------
> kyrill007: Thank you, Nathan, I think it would be relatively simple to 
> implement and would have big impact. Now we're forced to manage container 
> initializations through lazy static fields. You'd love to see that code. :-)
> ----------
> nathanmarz: Yup, this should be fairly easy to implement. I encourage you to 
> submit a patch for this.
> ----------
> kyrill007: Oh, well, my Clojure is unfortunately too weak for this kind of 
> work. But I am working on it... Any pointers as to where in Storm code 
> workers are started and stopped?
> ----------
> nathanmarz: Here's the function that's called to start a worker: 
> https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/daemon/worker.clj#L315
> And here's the code in the same file that shuts down a worker:
> https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/daemon/worker.clj#L352
> I think the interface for the lifecycle stuff should look something like this:
> interface WorkerHook extends Serializable {
> void start(Map conf, TopologyContext context, List taskIds);
> void shutdown();
> }
> You'll need to add a definition for worker hooks into the topology definition 
> Thrift structure:
> https://github.com/nathanmarz/storm/blob/master/src/storm.thrift#L91
> I think for the first go it's ok to make this a Java-only feature by adding 
> something like "4: list<binary> worker_hooks" to the StormTopology structure 
> (where the "binary" refers to a Java-serialized object).
> Then TopologyBuilder can have a simple "addWorkerHook" method that will 
> serialize the object and add it to the Thrift struct.
> ----------
> danehammer: I've started working on this. I've followed Nathan's proposed 
> design, but I keep hitting snags with calls to ThriftTopologyUtils, now that 
> there is an optional list on StormTopology.
> I would like to add some unit tests for what I change there, would it make 
> more sense for those to be in Java instead of Clojure? If so, are there any 
> strong preferences on what dependencies I add and how I go about adding Java 
> unit tests to storm-core?
> ----------
> nathanmarz: No... unit tests should remain in Clojure. You can run Java code 
> in Clojure very easily. Here's a good example of testing Java code in 
> Clojure: 
> https://github.com/nathanmarz/storm/blob/master/storm-core/test/clj/backtype/storm/security/auth/AuthUtils_test.clj
> ----------
> danehammer: For this design:
> interface WorkerHook extends Serializable {
>     void start(Map conf, TopologyContext context, List taskIds);
>     void shutdown();
> }
> When mk-worker has the worker hooks and goes to call start on them, where 
> should it get the topology context? I started with what was returned by 
> (worker-context worker), which returns a WorkerTopologyContext, but that 
> doesn't appear to be correct.
> ----------
> danehammer: If this helps, here's what I'm looking at right now: 
> https://github.com/danehammer/storm/blob/issue-155/storm-core/src/clj/backtype/storm/daemon/worker.clj#L329



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to