[ https://issues.apache.org/jira/browse/STORM-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259806#comment-15259806 ]
Erik Weathers commented on STORM-143:
-------------------------------------

[~revans2]: it seems this issue is fixed by the LogWriter that was introduced in storm-0.10.0. I cannot find a ticket for that feature to link this against, though.

> Launching a process throws away standard out; can hang
> ------------------------------------------------------
>
>                 Key: STORM-143
>                 URL: https://issues.apache.org/jira/browse/STORM-143
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/489
> https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/util.clj#L349
> When we launch a process, standard out is written to a system buffer and does
> not appear to be read. Also, nothing is redirected to standard in. This can
> have the following effects:
> * A worker can hang when initializing (e.g. UnsatisfiedLinkError looking for
>   jzmq), and it will be unable to communicate the error as standard out is
>   being swallowed.
> * A process that writes too much to standard out will block if the buffer
>   fills.
> * A process that tries to read from standard in for any reason will block.
> Perhaps we can redirect standard out to an .out file, and redirect /dev/null
> to the standard in stream of the process?
> ----------
> nathanmarz: Storm redirects stdout to the logging system. It's worked fine
> for us in our topologies.
> ----------
> d2r: We see in worker.clj, in mk-worker, where there is a call to
> redirect-stdio-to-slf4j!. This would not seem to help in cases such as we are
> seeing when there is a problem launching the worker itself.
> (defn -main [storm-id assignment-id port-str worker-id]
>   (let [conf1 (read-storm-config)
>         login_conf_file (System/getProperty "java.security.auth.login.config")
>         conf (if login_conf_file
>                (merge conf1 {"java.security.auth.login.config" login_conf_file})
>                conf1)]
>     (validate-distributed-mode! conf)
>     (mk-worker conf nil (java.net.URLDecoder/decode storm-id) assignment-id
>                (Integer/parseInt port-str) worker-id)))
> If anything were to go wrong (CLASSPATH, jvm opts, misconfiguration...)
> before -main or before mk-worker, then any output would be lost. The symptom
> we saw was that the topology sat around apparently doing nothing, yet there
> was no log indicating that the workers were failing to start.
> Is there other redirection to logs that I'm missing?
> ----------
> xiaokang: We use bash to launch the worker process and redirect its stdout to
> a worker-port.out file. It helped us find the zeromq JNI problem that caused
> the JVM to crash without any log.
> ----------
> nathanmarz: @d2r Yea, that's all I was referring to. If we redirect stdout,
> will the code that redirects stdout to the logging system still take effect?
> This is important because we can control the size of the logfiles (via the
> logback config) but not the size of the redirected stdout file.
> ----------
> d2r: My hunch is that it will work as it does now, except that any messages
> that are getting thrown away before that point would go to a file instead. I
> can play with it and find out. We wouldn't want to change the redirection,
> just restore visibility to any output that might occur prior to the
> redirection. There should be some safety valve to control the size of any new
> .out in case something goes berserk.
> @xiaokang I see how that would work. We also need to make sure redirection
> continues to work as it currently does for the above reason.
> ----------
> xiaokang: @d2r @nathanmarz In our cluster, Storm's stdout redirection still
> works for any System.out output, while JNI errors go to the worker-port.out
> file. I think it would be nice to use the same worker-port.log file for the
> bash stdout redirection, since logback can control the log file size. But it
> is a little bit ugly to use bash to launch the worker java process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
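
For reference, the redirection proposed in the issue description (standard out to an .out file, /dev/null to standard in) can be sketched with the JDK's ProcessBuilder. This is only an illustration in Java, not the code Storm uses and not the LogWriter mentioned above. The class name RedirectedLauncher, the method launch, and the temp-file name are hypothetical, and it assumes a Unix-like host where /dev/null and sh exist.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Storm.
final class RedirectedLauncher {

    // Starts the command with stdout and stderr appended to outFile and
    // stdin connected to /dev/null, so the child neither blocks on a full
    // pipe buffer nor waits forever on a read from standard in.
    public static Process launch(List<String> command, File outFile) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(outFile));
        // Assumption: Unix-like host where /dev/null exists.
        pb.redirectInput(ProcessBuilder.Redirect.from(new File("/dev/null")));
        return pb.start();
    }

    public static void main(String[] args) throws Exception {
        File out = File.createTempFile("worker-port", ".out");
        // "cat" reads stdin; with /dev/null it sees end-of-file at once and exits.
        Process p = launch(Arrays.asList("sh", "-c", "echo started; cat"), out);
        System.out.println("exit code: " + p.waitFor());
        System.out.println(new String(Files.readAllBytes(out.toPath())));
    }
}
```

Note that nothing here bounds the size of the .out file, which is the concern nathanmarz and d2r raise in the thread; a real launcher would still need some safety valve for that.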