[jira] [Commented] (SLIDER-1246) Application health should not be affected by faulty nodes

ASF subversion and git services (JIRA) Sun, 01 Oct 2017 22:18:29 -0700

    [ https://issues.apache.org/jira/browse/SLIDER-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187666#comment-16187666 ]


ASF subversion and git services commented on SLIDER-1246:
---------------------------------------------------------

Commit 0f436c865a90aba5b427d1c0571183c6fcbded1e in incubator-slider's branch 
refs/heads/develop from [~gsaha]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-slider.git;h=0f436c8 ]

SLIDER-1246 Application health should not be affected by faulty nodes (health 
monitor based on percent threshold)


> Application health should not be affected by faulty nodes
> ---------------------------------------------------------
>
>                 Key: SLIDER-1246
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1246
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster, core
>    Affects Versions: Slider 0.92
>            Reporter: Prasanth Jayachandran
>            Assignee: Gour Saha
>             Fix For: Slider 1.0.0
>
>         Attachments: SLIDER-1246.01.patch, SLIDER-1246.02.patch, 
> SLIDER-1246.03.patch, SLIDER-1246.04.patch
>
>
> When a node is faulty, multiple container failures on it are treated as an 
> application failure. 
> This was observed in HIVE-16927, where container failures on certain nodes 
> brought down the entire application. Slider should provide a way to avoid 
> marking the application as unhealthy while a certain threshold of containers 
> is running. Tuning the failure threshold is not optimal, because choosing 
> the correct default on a large cluster is not trivial. Beyond a certain 
> number of failures, Slider should mark the node as unhealthy and report 
> that back to the client/AM. The application could continue to run as long 
> as the container request is partially satisfied (for example, 80% of 
> containers are running).
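
The percent-threshold idea in the description above can be sketched as
follows. This is only a minimal illustration, not the code from the
commit: the class name PercentHealthCheck, the method isHealthy, and its
parameters are hypothetical, and treating a request for zero containers
as healthy is an assumption.

```java
// Hypothetical sketch of a percent-threshold health check.
// Names and behavior are illustrative assumptions, not Slider's API.
public final class PercentHealthCheck {
    private PercentHealthCheck() {}

    /**
     * Returns true when at least thresholdPercent of the desired
     * containers are currently running.
     */
    public static boolean isHealthy(int desired, int running, int thresholdPercent) {
        if (desired <= 0) {
            // Assumption: with nothing requested, there is nothing to be unhealthy about.
            return true;
        }
        // Integer arithmetic avoids floating-point rounding at the boundary;
        // long prevents overflow for very large container counts.
        return (long) running * 100 >= (long) desired * thresholdPercent;
    }
}
```

With a threshold of 80, an application that requested 10 containers would
still count as healthy with 8 running and as unhealthy with 7 running.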



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
