[
https://issues.apache.org/jira/browse/STORM-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erik Weathers resolved STORM-143.
---------------------------------
Resolution: Fixed
Fix Version/s: 0.10.0
> Launching a process throws away standard out; can hang
> ------------------------------------------------------
>
> Key: STORM-143
> URL: https://issues.apache.org/jira/browse/STORM-143
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: James Xu
> Priority: Minor
> Fix For: 0.10.0
>
>
> https://github.com/nathanmarz/storm/issues/489
> https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/util.clj#L349
> When we launch a process, standard out is written to a system buffer and does
> not appear to be read. Also, nothing is redirected to standard in. This can
> have the following effects:
> A worker can hang when initializing (e.g. UnsatisfiedLinkError looking for
> jzmq), and it will be unable to communicate the error as standard out is
> being swallowed.
> A process that writes too much to standard out will block if the buffer fills.
> A process that tries to read from standard in for any reason will block.
> Perhaps we can redirect standard out to an .out file, and redirect /dev/null
> to the standard in stream of the process?
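
For illustration, the redirection proposed above could look roughly like the following on the JVM. This is a minimal sketch, not the change that shipped in 0.10.0: the class name LaunchSketch, the launch helper, and the temp-file name are hypothetical, and it assumes a Unix-like host where /dev/null and sh exist. It uses only the standard java.lang.ProcessBuilder redirect API (Java 7+).

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only; not part of Storm.
class LaunchSketch {

    /**
     * Starts the command with stdout and stderr appended to outFile and
     * stdin connected to /dev/null (assumes a Unix-like host).
     */
    static Process launch(List<String> command, File outFile) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Merge stderr into stdout so both end up in the same file.
        pb.redirectErrorStream(true);
        // The child writes straight to the file, so no pipe buffer can fill up.
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(outFile));
        // Reads from stdin see end-of-file immediately instead of blocking.
        pb.redirectInput(ProcessBuilder.Redirect.from(new File("/dev/null")));
        return pb.start();
    }

    public static void main(String[] args) throws Exception {
        File out = File.createTempFile("worker-sketch", ".out");
        // "cat" would block forever on an unfed pipe; here it returns at once.
        Process p = launch(Arrays.asList("sh", "-c", "echo started; cat; echo done"), out);
        System.out.println("exit code: " + p.waitFor());
        System.out.print(new String(Files.readAllBytes(out.toPath()), StandardCharsets.UTF_8));
    }
}
```

Note that a file written this way is outside the logging system, so nothing rotates or caps it; that is the concern raised later in the thread.
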
> ----------
> nathanmarz: Storm redirects stdout to the logging system. It's worked fine
> for us in our topologies.
> ----------
> d2r: We see in worker.clj, in mk-worker, where there is a call to
> redirect-stdio-to-slf4j!. This would not seem to help in cases such as we are
> seeing when there is a problem launching the worker itself.
> (defn -main [storm-id assignment-id port-str worker-id]
>   (let [conf1 (read-storm-config)
>         login_conf_file (System/getProperty "java.security.auth.login.config")
>         conf (if login_conf_file
>                (merge conf1 {"java.security.auth.login.config" login_conf_file})
>                conf1)]
>     (validate-distributed-mode! conf)
>     (mk-worker conf nil (java.net.URLDecoder/decode storm-id) assignment-id
>                (Integer/parseInt port-str) worker-id)))
> If anything were to go wrong (CLASSPATH, jvm opts, misconfiguration...)
> before -main or before mk-worker, then any output would be lost. The symptom
> we saw was that the topology sat around apparently doing nothing, yet there
> was no log indicating that the workers were failing to start.
> Is there other redirection to logs that I'm missing?
> ----------
> xiaokang: We use bash to launch the worker process and redirect its stdout
> to a worker-port.out file. It helped us find the zeromq JNI problem that
> caused the JVM to crash without any log.
> ----------
> nathanmarz: @d2r Yea, that's all I was referring to. If we redirect stdout,
> will the code that redirects stdout to the logging system still take effect?
> This is important because we can control the size of the logfiles (via the
> logback config) but not the size of the redirected stdout file.
> ----------
> d2r: My hunch is that it will work as it does now, except that any messages
> that are getting thrown away before that point would go to a file instead. I
> can play with it and find out. We wouldn't want to change the redirection,
> just restore visibility to any output that might occur prior to the
> redirection. There should be some safety valve to control the size of any new
> .out in case something goes berserk.
> @xiaokang I see how that would work. We also need to make sure redirection
> continues to work as it currently does for the above reason.
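
One possible shape for such a safety valve, sketched under assumptions: instead of redirecting straight to a file, the launcher keeps the pipe and runs a small pump that writes at most a fixed number of bytes and then keeps reading while discarding the rest, so the child never blocks on a full pipe. The class name CappedDrain, the drain method, and the byte limits are hypothetical; nothing here is taken from the Storm code base.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, for illustration only; not part of Storm.
class CappedDrain {

    /**
     * Copies bytes from in to out until limit bytes have been written, then
     * keeps reading and discards the remainder until end-of-stream.
     * Returns the number of bytes actually written to out.
     */
    static long drain(InputStream in, OutputStream out, long limit) throws IOException {
        byte[] buf = new byte[8192];
        long written = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            // How much of this chunk still fits under the cap (may be 0).
            int keep = (int) Math.min(n, limit - written);
            if (keep > 0) {
                out.write(buf, 0, keep);
                written += keep;
            }
            // Anything beyond the cap is dropped, but the read still happened,
            // so the writer on the other end of the pipe is never stalled.
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long kept = drain(new ByteArrayInputStream(new byte[100]), sink, 10);
        System.out.println("kept " + kept + " of 100 bytes");
    }
}
```

In a launcher this would typically run on its own thread against process.getInputStream(), with a file stream as the sink. The trade-off is that output past the cap is lost rather than rotated.
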
> ----------
> xiaokang: @d2r @nathanmarz In our cluster, Storm's stdout redirection still
> works for any System.out output, while JNI errors go to the worker-port.out
> file. I think it would be nice to use the same worker-port.log file for the
> bash stdout redirection, since logback can control the log file size. But it
> is a little bit ugly to use bash to launch the worker Java process.