[ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481979#comment-15481979
 ] 

Naganarasimha G R commented on YARN-5567:
-----------------------------------------

[~aw], I understand that a typo in the health check script can bring down 
the whole cluster, hence we need to revert this. But at the same time, with an 
erroneous script there is a possibility that the script failed to detect 
some health check failures on the node.
Should we think of some other state that could warn the admin about this (one 
that is surfaced in the web UI/REST)?
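
To make the idea of an extra state concrete, here is a rough, purely 
illustrative sketch. It is not taken from the patch or from the existing 
NodeHealthScriptRunner code: the class HealthStateSketch, the NodeState enum, 
the HEALTHY_WITH_WARNING value and the classify/isSchedulable helpers are all 
hypothetical names, and ExitStatus is only a local stand-in for the runner's 
real exit status enum. The assumption is that a script which fails with an 
exit code keeps the node schedulable but is flagged so the admin can see it.

```java
// Hypothetical sketch only: none of these types exist in Hadoop under these names.
final class HealthStateSketch {

  /** Local stand-in for the script runner's exit status (not the real Hadoop enum). */
  enum ExitStatus { SUCCESS, FAILED, FAILED_WITH_EXIT_CODE }

  /** Hypothetical node health states; HEALTHY_WITH_WARNING is the proposed extra state. */
  enum NodeState { HEALTHY, UNHEALTHY, HEALTHY_WITH_WARNING }

  /** Maps a script exit status to a node state under the assumptions stated above. */
  static NodeState classify(ExitStatus status) {
    switch (status) {
      case SUCCESS:
        return NodeState.HEALTHY;
      case FAILED_WITH_EXIT_CODE:
        // The script itself broke (for example a typo), so the node is not
        // marked unhealthy, but the condition is still made visible.
        return NodeState.HEALTHY_WITH_WARNING;
      default:
        // The script ran and reported a problem.
        return NodeState.UNHEALTHY;
    }
  }

  /** Only UNHEALTHY nodes are taken out of scheduling in this sketch. */
  static boolean isSchedulable(NodeState state) {
    return state != NodeState.UNHEALTHY;
  }

  public static void main(String[] args) {
    for (ExitStatus s : ExitStatus.values()) {
      NodeState state = classify(s);
      System.out.println(s + " -> " + state + ", schedulable=" + isSchedulable(state));
    }
  }
}
```

With something along these lines, a broken script would no longer take nodes 
out of the cluster, while the warning state could still be reported wherever 
node health is already shown.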

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --------------------------------------------------------------------------
>
>                 Key: YARN-5567
>                 URL: https://issues.apache.org/jira/browse/YARN-5567
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.8.0, 3.0.0-alpha1
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>             Fix For: 3.0.0-alpha1
>
>         Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>       case FAILED_WITH_EXIT_CODE:
>         setHealthStatus(true, "", now);
>         break;
> {code}
> should be 
> {code}
>       case FAILED_WITH_EXIT_CODE:
>         setHealthStatus(false, "", now);
>         break;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)