[ 
https://issues.apache.org/jira/browse/STORM-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357342#comment-14357342
 ] 

ASF GitHub Bot commented on STORM-532:
--------------------------------------

Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/296#discussion_r26240829
  
    --- Diff: storm-core/src/clj/backtype/storm/util.clj ---
    @@ -392,6 +392,15 @@
           (.addArgument command a))
         (.execute (DefaultExecutor.) command)))
     
    +(defn exists-process?
    +   [process-id]
    +   (let [line (if on-windows? (str "cmd /c \"tasklist /FI \"PID eq "  process-id  "\" | findstr "  process-id  "\"" )
    +                              (str "ps -p "  process-id))]
    +        (try-cause
    +           (exec-command! line)
    --- End diff --
    
    For me, the supervisor echoes the output of the ps command a lot. That is
    a pain and makes debugging difficult in some cases. Could we suppress that
    output unless the command fails?
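
One way to keep the ps output out of the supervisor log is to capture it rather than let the child process inherit the console. The sketch below is only an illustration, not the code in the pull request: it swaps the patch's `exec-command!` for `clojure.java.shell/sh`, which buffers stdout and stderr, and `quiet-exists-process?` is a hypothetical name. It covers the non-Windows branch only.

```clojure
(ns example.quiet-ps
  (:require [clojure.java.shell :refer [sh]]
            [clojure.string :as str]))

(defn quiet-exists-process?
  "Hypothetical helper: true if `ps -p <pid>` exits with status 0.
  `sh` captures stdout and stderr, so nothing is echoed by default."
  [process-id]
  (let [{:keys [exit err]} (sh "ps" "-p" (str process-id))]
    ;; A non-zero exit with empty stderr just means no process matched,
    ;; which is an expected outcome and is not reported.
    (when (and (not (zero? exit)) (not (str/blank? err)))
      (binding [*out* *err*]
        (println "ps -p" process-id "failed:" (str/trim err))))
    (zero? exit)))
```

Only a genuine ps failure (something written to stderr) is printed. `sh` reads the streams on futures, so a standalone script should call `(shutdown-agents)` before exiting; a long-running daemon does not need to.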
    
    It would also be nice to have a Linux-specific path that checks /proc/
    directly, so that we don't need to run ps at all.
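
A minimal sketch of the /proc idea, assuming a Linux-style procfs where every live process has a `/proc/<pid>` directory. Both function names are hypothetical and neither appears in the pull request.

```clojure
(ns example.proc-check
  (:import [java.io File]))

(defn proc-fs-available?
  "True when a Linux-style /proc filesystem appears to be mounted."
  []
  (.isDirectory (File. "/proc/self")))

(defn exists-process-via-proc?
  "Hypothetical helper: true if /proc/<pid> is a directory.
  No child process is spawned, so there is no output to echo."
  [process-id]
  (.isDirectory (File. "/proc" (str process-id))))
```

A caller would presumably test `(proc-fs-available?)` first and fall back to ps where /proc is missing. One caveat: a zombie process still has a `/proc/<pid>` entry until it is reaped, so this check alone would report it as existing.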


> Supervisor should restart worker immediately, if the worker process does not 
> exist any more 
> --------------------------------------------------------------------------------------------
>
>                 Key: STORM-532
>                 URL: https://issues.apache.org/jira/browse/STORM-532
>             Project: Apache Storm
>          Issue Type: Improvement
>    Affects Versions: 0.10.0
>            Reporter: caofangkun
>            Assignee: caofangkun
>            Priority: Minor
>
> For now, if the worker process no longer exists, the supervisor has to wait 
> a few seconds for the worker heartbeat timeout before it restarts the worker.
> If the supervisor knows the worker's process id and checks whether the process 
> exists in the sync-processes thread, it may need less time to restart the worker:
> 1: record the worker process id in the worker local heartbeat
> 2: in supervisor sync-processes, get the process id from the worker local 
> heartbeat and check whether the process exists
> 3: if not, restart it immediately
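
The three steps above can be sketched as a pure function. This is an illustration under stated assumptions, not Storm's supervisor code: the `:process-id` key, the shape of the heartbeat map, and the name `dead-worker-ids` are all hypothetical.

```clojure
(ns example.dead-workers)

(defn dead-worker-ids
  "Hypothetical helper. `heartbeats` maps worker id -> local heartbeat map,
  assumed to carry the recorded pid under :process-id (step 1).
  `alive?` is any pid -> boolean predicate (step 2).
  Returns the ids of workers whose process is gone (candidates for step 3)."
  [heartbeats alive?]
  (for [[worker-id hb] heartbeats
        :let [pid (:process-id hb)]
        ;; Workers without a recorded pid are skipped, so they would still
        ;; be handled by the ordinary heartbeat timeout.
        :when (and pid (not (alive? pid)))]
    worker-id))
```

For example, `(dead-worker-ids {"w1" {:process-id 101} "w2" {:process-id 202} "w3" {}} #{101})` yields `("w2")`: only pid 101 is alive, and "w3" has no recorded pid.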



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
