[ https://issues.apache.org/jira/browse/STORM-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191297#comment-14191297 ]
ASF GitHub Bot commented on STORM-442:
--------------------------------------

Github user dashengju commented on the pull request:

    https://github.com/apache/storm/pull/305#issuecomment-61213166

    We have reproduced the problem: ShellProcess#getErrorsStream hangs when only the parent process raises an exception and the subprocess is still working.

    Following @d2r's suggestion, we changed ShellProcess#getErrorsStream so that it first checks that there is actually something to read, as ShellProcess#logErrorStream does. With that change the problem is solved.

    @d2r, @revans2, please help review this pull request.


> multilang ShellBolt/ShellSpout die() can be hang when Exception happened
> ------------------------------------------------------------------------
>
>                 Key: STORM-442
>                 URL: https://issues.apache.org/jira/browse/STORM-442
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3
>            Reporter: DashengJu
>
> In ShellBolt, the _readerThread reads commands from the python/shell process
> and handles them like this:
>     try {
>         ShellMsg shellMsg = _process.readShellMsg();
>         ...
>     } catch (InterruptedException e) {
>     } catch (Throwable t) {
>         die(t);
>     }
> In the die function, getProcessTerminationInfoString reads
> getErrorsString() from processErrorStream:
>     private void die(Throwable exception) {
>
>         String processInfo = _process.getProcessInfoString() +
>             _process.getProcessTerminationInfoString();
>
>         _exception = new RuntimeException(processInfo, exception);
>
>     }
> So when ShellBolt gets an exception (for example, readShellMsg() throws an
> NPE) that is not an error from the subprocess,
> getProcessTerminationInfoString will hang, because processErrorStream has
> no data to read.
> On the other hand, as [~xiaokang] says, ShellBolt should fail fast on an
> exception ( https://github.com/apache/incubator-storm/pull/46 ), so I think
> it is not a good idea to read error info from the stream.
> Because [~xiaokang]'s PR is based on an old version, I will move his code
> into this PR and modify some other places in ShellSpout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
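
The fix described in the comment above (read the error stream only when there is actually something to read) can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class name ErrorStreamSketch and the method readAvailable are hypothetical, and the sketch assumes the subprocess stderr is exposed as a plain java.io.InputStream. It relies on InputStream#available(), which only estimates how many bytes can be read without blocking, so a return value of 0 means "nothing buffered right now", not "end of stream".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Storm: drains only the bytes that are
// already buffered on a stream, so the caller never blocks on an idle pipe.
final class ErrorStreamSketch {

    private ErrorStreamSketch() {
    }

    // Returns the currently buffered content as a UTF-8 string, or "" if
    // nothing is available.
    static String readAvailable(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            int n;
            // available() reports bytes readable without blocking; stop as
            // soon as it reports none instead of calling a blocking read().
            while ((n = in.available()) > 0) {
                byte[] buf = new byte[n];
                int read = in.read(buf, 0, n);
                if (read <= 0) {
                    break; // end of stream or nothing actually read
                }
                out.write(buf, 0, read);
            }
        } catch (IOException e) {
            // Assumption: on the die() path a failed read should surface as
            // an unchecked exception rather than being handled locally.
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In a die()-style handler, a call such as ErrorStreamSketch.readAvailable(process.getErrorStream()) would return an empty string when the subprocess has written nothing to stderr, instead of waiting for output that may never arrive. The trade-off is that error text the subprocess writes after the check is not captured.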