GitHub user dashengju opened a pull request:

    https://github.com/apache/storm/pull/305

    [STORM-442] multilang ShellBolt/ShellSpout die() can be hang when Exception 
happened

    @d2r , @harshach , @HeartSaVioR ,  I closed the old pull request by 
accident, so I opened a new one, and move all the comments to here.
    
------------------------------------------------------------------------------------------------
    In ShellBolt, the _readerThread read command from python/shell process, and 
handle like this:
    try
    { 
    ShellMsg shellMsg = _process.readShellMsg(); 
    ... 
    } catch (InterruptedException e) {
    } catch (Throwable t) { 
    die(t); 
    }
    
    And in the die function, getProcessTerminationInfoString will read 
getErrorsString() from processErrorStream.
    
    private void die(Throwable exception) { 
    String processInfo = _process.getProcessInfoString() + 
_process.getProcessTerminationInfoString(); 
    _exception = new RuntimeException(processInfo, exception); 
    }
    
    so when ShellBolt got exception(for example, readShellMsg() throw NPE ) , 
but it is not an error from sub process, then getProcessTerminationInfoString 
will be hang because processErrorStream have no data to read.
    
    
------------------------------------------------------------------------------------------------
    harshach commented on 21 Aug
    @dashengju Can you clarify "getProcessTerminationInfoString will be hang 
because processErrorStream have no data to read". Which case it might hang , 
looking at the code 
https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L135
 If there is no data in processErrorStream it returns empty string.
    If possible can you add a unit test. Thanks.
    
    
------------------------------------------------------------------------------------------------
    d2r added a note 29 days ago
    I think we should change ShellProcess#getErrorsStream so that it checks 
that there is actually something to read.
    
    This would fix a couple of other places where a hang looks possible:
    
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L68
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L70
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L101
    
    ShellProcess#logErrorStream does this.
    
    
------------------------------------------------------------------------------------------------
    d2r commented a day ago
    Hi @dashengju, any update on this pull request?
    
    
------------------------------------------------------------------------------------------------
 
    dashengju commented 15 hours ago
    @d2r I had no time work on this pull request those days. 
    I will verify the problem, make some unit test and submit update on the 
pull request next few days.
    any suggestion?
    
    
------------------------------------------------------------------------------------------------
    HeartSaVioR commented 5 hours ago
    Is there a relation between parent process's exception and subprocess's 
stderr?
    If it's not, and we should capture it from parent, I think spout may have 
stderr reader with new thread.
    Because it may not be right way to capture it when only parent process 
raises exception.
    
    
------------------------------------------------------------------------------------------------
 
    dashengju commented an hour ago
    @HeartSaVioR , 
    parent process's exception contains two types: 1) net io error caused; for 
example, subprocess report an error to stderr and exit; 2) parent process's 
handle logic exception.
    as 1), we want to read from stderr to get more information; 
    as 2), as you say, it may not be right way to capture it when only parent 
process raises exception.
    
    @d2r 's suggestion, we should change ShellProcess#getErrorsStream so that 
it checks that there is actually something to read.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dashengju/storm Multilang_die_hang

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/305.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #305
    
----
commit ac5a18dc30ee6e9257eb5c18c0c87e7c7e944b23
Author: JuDasheng <[email protected]>
Date:   2014-10-30T03:46:21Z

    change ShellProcess#getErrorsStream so that it checks that there is 
actually something to read.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to