[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703640#comment-14703640 ]
Anubhav Dhoot commented on YARN-2005:
-------------------------------------

[~jianhe] thanks for your comments.

bq. RMApp and RMAppAttempt need not be involved in the loop.

In the current patch, the following responsibilities are assigned to RMApp/RMAppAttempt:
a) When an AM fails, add that host to the system blacklist.
b) Before launching the AM, activate the system blacklist with the currently known AM failure hosts.
c) After the AM launch succeeds, deactivate the system blacklist to avoid impacting other user allocations.

Since AM launch is the responsibility of RMAppAttempt, I kept all of these there. Can you please elaborate on where and how these would be done in SchedulerApplication/AppSchedulingInfo in a clean way?

Thanks.

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-2005.001.patch, YARN-2005.002.patch,
> YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch,
> YARN-2005.006.patch, YARN-2005.006.patch
>
>
> It would be nice if the RM supported blacklisting a node for an AM launch
> after the same node fails a configurable number of AM attempts. This would
> be similar to the blacklisting support for scheduling task attempts in the
> MapReduce AM but for scheduling AM attempts on the RM side.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
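
For readers following the thread, responsibilities (a) to (c) from the comment can be pictured with a minimal Java sketch. This is not code from any YARN-2005 patch: the class name AmBlacklistSketch, its method names, and the failureThreshold parameter are all hypothetical, and the threshold is only an assumption drawn from the "configurable number of AM attempts" wording in the issue description.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical illustration only; these are not the classes or method
 * names used in the YARN-2005 patches.
 */
final class AmBlacklistSketch {
    // Assumed: number of AM failures on a host before it is blacklisted.
    private final int failureThreshold;
    // AM failure count per host, kept across attempts.
    private final Map<String, Integer> amFailuresByHost = new HashMap<>();
    // Hosts currently excluded; empty unless an AM launch is in progress.
    private Set<String> activeBlacklist = Collections.emptySet();

    AmBlacklistSketch(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    /** (a) An AM failed on this host: record the failure. */
    void onAmFailed(String host) {
        amFailuresByHost.merge(host, 1, Integer::sum);
    }

    /** (b) Before launching the AM: activate the blacklist from known failures. */
    Set<String> activateForAmLaunch() {
        Set<String> hosts = new HashSet<>();
        for (Map.Entry<String, Integer> e : amFailuresByHost.entrySet()) {
            if (e.getValue() >= failureThreshold) {
                hosts.add(e.getKey());
            }
        }
        activeBlacklist = hosts;
        return Collections.unmodifiableSet(activeBlacklist);
    }

    /** (c) After the AM launch succeeds: stop excluding hosts for other allocations. */
    void deactivateAfterAmLaunch() {
        activeBlacklist = Collections.emptySet();
    }

    /** True only while the blacklist is active and contains the host. */
    boolean isBlacklisted(String host) {
        return activeBlacklist.contains(host);
    }
}
```

In this sketch the failure history survives deactivation, so a later AM attempt can re-activate the same blacklist; whether that state belongs in RMAppAttempt or in SchedulerApplication/AppSchedulingInfo is exactly the open question in the comment.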