[ 
https://issues.apache.org/jira/browse/STORM-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Weathers resolved STORM-143.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 0.10.0

> Launching a process throws away standard out; can hang
> ------------------------------------------------------
>
>                 Key: STORM-143
>                 URL: https://issues.apache.org/jira/browse/STORM-143
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>             Fix For: 0.10.0
>
>
> https://github.com/nathanmarz/storm/issues/489
> https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/util.clj#L349
> When we launch a process, standard out is written to a system buffer and does 
> not appear to be read. Also, nothing is redirected to standard in. This can 
> have the following effects:
> - A worker can hang when initializing (e.g. UnsatisfiedLinkError looking for 
>   jzmq), and it will be unable to communicate the error as standard out is 
>   being swallowed.
> - A process that writes too much to standard out will block if the buffer 
>   fills.
> - A process that tries to read from standard in for any reason will block.
> Perhaps we can redirect standard out to an .out file, and redirect /dev/null 
> to the standard in stream of the process?
> ----------
> nathanmarz: Storm redirects stdout to the logging system. It's worked fine 
> for us in our topologies.
> ----------
> d2r: We see in worker.clj, in mk-worker, where there is a call to 
> redirect-stdio-to-slf4j!. This would not seem to help in cases such as we are 
> seeing when there is a problem launching the worker itself.
> (defn -main [storm-id assignment-id port-str worker-id]
>   (let [conf1 (read-storm-config)
>         login_conf_file (System/getProperty "java.security.auth.login.config")
>         conf (if login_conf_file
>                (merge conf1 {"java.security.auth.login.config" login_conf_file})
>                conf1)]
>     (validate-distributed-mode! conf)
>     (mk-worker conf nil (java.net.URLDecoder/decode storm-id) assignment-id
>                (Integer/parseInt port-str) worker-id)))
> If anything were to go wrong (CLASSPATH, jvm opts, misconfiguration...) 
> before -main or before mk-worker, then any output would be lost. The symptom 
> we saw was that the topology sat around apparently doing nothing, yet there 
> was no log indicating that the workers were failing to start.
> Is there other redirection to logs that I'm missing?
> ----------
> xiaokang: We use bash to launch the worker process and redirect its stdout to 
> a worker-port.out file. It helped us find the ZeroMQ JNI problem that caused 
> the JVM to crash without any log.
> ----------
> nathanmarz: @d2r Yea, that's all I was referring to. If we redirect stdout, 
> will the code that redirects stdout to the logging system still take effect? 
> This is important because we can control the size of the logfiles (via the 
> logback config) but not the size of the redirected stdout file.
> ----------
> d2r: My hunch is that it will work as it does now, except that any messages 
> that are getting thrown away before that point would go to a file instead. I 
> can play with it and find out. We wouldn't want to change the redirection, 
> just restore visibility to any output that might occur prior to the 
> redirection. There should be some safety valve to control the size of any new 
> .out in case something goes berserk.
> @xiaokang I see how that would work. We also need to make sure redirection 
> continues to work as it currently does for the above reason.
> ----------
> xiaokang: @d2r @nathanmarz In our cluster, Storm's stdout redirection still 
> works for any System.out output, while JNI errors go to the worker-port.out 
> file. I think it would be nice to use the same worker-port.log file for the 
> bash stdout redirection, since logback can control log file size. But it is a 
> little bit ugly to use bash to launch the worker Java process.
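
The redirection proposed in the description (standard out to an .out file, /dev/null to standard in) can be sketched without a bash wrapper by using java.lang.ProcessBuilder. This is only an illustration under stated assumptions, not the change that was committed for this issue: the class name LaunchWithRedirect, the method launch, and the temporary file name are hypothetical, and the sketch assumes Java 7 or later on a POSIX system where sh and /dev/null exist.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Storm.
public class LaunchWithRedirect {

    /**
     * Starts the command with stdout and stderr appended to outFile and with
     * stdin connected to /dev/null, then waits for the child to exit.
     */
    public static int launch(List<String> command, File outFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Send stderr to the same destination as stdout.
        pb.redirectErrorStream(true);
        // Write to a file instead of a pipe, so output is neither lost nor
        // able to fill a pipe buffer that nobody reads.
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(outFile));
        // Any read from stdin sees end-of-file immediately instead of blocking.
        pb.redirectInput(ProcessBuilder.Redirect.from(new File("/dev/null")));
        Process process = pb.start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file name, loosely modelled on the worker-port.out
        // naming mentioned in the thread.
        File out = File.createTempFile("worker-6700", ".out");
        // The child writes to stdout, reads stdin, then writes to stderr.
        int exit = launch(
                Arrays.asList("sh", "-c", "echo started; cat; echo failed >&2"),
                out);
        System.out.println("exit=" + exit);
        System.out.print(new String(Files.readAllBytes(out.toPath())));
    }
}
```

Because the file is opened by the operating system before the child JVM starts, anything printed before -main or mk-worker runs (classpath errors, bad JVM options, JNI crashes) lands in the file. The sketch does not limit the size of the .out file, which is the concern raised by nathanmarz and d2r above.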



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
