[ 
https://issues.apache.org/jira/browse/STORM-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189590#comment-14189590
 ] 

ASF GitHub Bot commented on STORM-442:
--------------------------------------

GitHub user dashengju opened a pull request:

    https://github.com/apache/storm/pull/305

    [STORM-442] multilang ShellBolt/ShellSpout die() can be hang when Exception 
happened

    @d2r , @harshach , @HeartSaVioR ,  I closed the old pull request by 
accident, so I opened a new one, and move all the comments to here.
    
------------------------------------------------------------------------------------------------
    In ShellBolt, the _readerThread read command from python/shell process, and 
handle like this:
    try
    { 
    ShellMsg shellMsg = _process.readShellMsg(); 
    ... 
    } catch (InterruptedException e) {
    } catch (Throwable t) { 
    die(t); 
    }
    
    And in the die function, getProcessTerminationInfoString will read 
getErrorsString() from processErrorStream.
    
    private void die(Throwable exception) { 
    String processInfo = _process.getProcessInfoString() + 
_process.getProcessTerminationInfoString(); 
    _exception = new RuntimeException(processInfo, exception); 
    }
    
    so when ShellBolt got exception(for example, readShellMsg() throw NPE ) , 
but it is not an error from sub process, then getProcessTerminationInfoString 
will be hang because processErrorStream have no data to read.
    
    
------------------------------------------------------------------------------------------------
    harshach commented on 21 Aug
    @dashengju Can you clarify "getProcessTerminationInfoString will be hang 
because processErrorStream have no data to read". Which case it might hang , 
looking at the code 
https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L135
 If there is no data in processErrorStream it returns empty string.
    If possible can you add a unit test. Thanks.
    
    
------------------------------------------------------------------------------------------------
    d2r added a note 29 days ago
    I think we should change ShellProcess#getErrorsStream so that it checks 
that there is actually something to read.
    
    This would fix a couple of other places where a hang looks possible:
    
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L68
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L70
    
https://github.com/apache/storm/blob/24b5eefb1553ae37951807502a433f24d3e95ec1/storm-core/src/jvm/backtype/storm/utils/ShellProcess.java#L101
    
    ShellProcess#logErrorStream does this.
    
    
------------------------------------------------------------------------------------------------
    d2r commented a day ago
    Hi @dashengju, any update on this pull request?
    
    
------------------------------------------------------------------------------------------------
 
    dashengju commented 15 hours ago
    @d2r I had no time work on this pull request those days. 
    I will verify the problem, make some unit test and submit update on the 
pull request next few days.
    any suggestion?
    
    
------------------------------------------------------------------------------------------------
    HeartSaVioR commented 5 hours ago
    Is there a relation between parent process's exception and subprocess's 
stderr?
    If it's not, and we should capture it from parent, I think spout may have 
stderr reader with new thread.
    Because it may not be right way to capture it when only parent process 
raises exception.
    
    
------------------------------------------------------------------------------------------------
 
    dashengju commented an hour ago
    @HeartSaVioR , 
    parent process's exception contains two types: 1) net io error caused; for 
example, subprocess report an error to stderr and exit; 2) parent process's 
handle logic exception.
    as 1), we want to read from stderr to get more information; 
    as 2), as you say, it may not be right way to capture it when only parent 
process raises exception.
    
    @d2r 's suggestion, we should change ShellProcess#getErrorsStream so that 
it checks that there is actually something to read.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dashengju/storm Multilang_die_hang

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/305.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #305
    
----
commit ac5a18dc30ee6e9257eb5c18c0c87e7c7e944b23
Author: JuDasheng <judash...@meituan.com>
Date:   2014-10-30T03:46:21Z

    change ShellProcess#getErrorsStream so that it checks that there is 
actually something to read.

----


> multilang ShellBolt/ShellSpout die() can be hang when Exception happened
> ------------------------------------------------------------------------
>
>                 Key: STORM-442
>                 URL: https://issues.apache.org/jira/browse/STORM-442
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3
>            Reporter: DashengJu
>
> In ShellBolt,  the _readerThread read command from python/shell process, and 
> handle like this:
>  try {
>         ShellMsg shellMsg = _process.readShellMsg();
>         ...                
>  } catch (InterruptedException e) {
>  } catch (Throwable t) {
>         die(t);
>  }
> And in the die function, getProcessTerminationInfoString will read 
> getErrorsString() from processErrorStream.
>  private void die(Throwable exception) {
>  
>          String processInfo = _process.getProcessInfoString() + 
> _process.getProcessTerminationInfoString();
>  
>          _exception = new RuntimeException(processInfo, exception);
>  
>  }
> so when ShellBolt got exception(for example, readShellMsg() throw NPE ) ,  
> but it is not an error from sub process,  then 
> getProcessTerminationInfoString will be hang because processErrorStream have 
> no data to read.
> On the other hand, as [~xiaokang] says ShellBolt should fail fast on 
> exception ( https://github.com/apache/incubator-storm/pull/46 ) , I think it 
> is not a good idea to read error info from stream.
> Because [~xiaokang] 's PR is based old version, so I will move his code to 
> this PR, and modify some other place in ShellSpout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to