[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588006#comment-16588006 ]
Botong Huang commented on YARN-8581:
------------------------------------

Thanks [~giovanni.fumarola] for the review!

> [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
> -----------------------------------------------------------------------
>
>                 Key: YARN-8581
>                 URL: https://issues.apache.org/jira/browse/YARN-8581
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: amrmproxy, federation
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Major
>         Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch,
>                      YARN-8581.v2.patch
>
>
> In Federation, every time an AM heartbeat comes in,
> LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to
> the list of active and enabled sub-clusters. However, if we haven't been able
> to heartbeat to a sub-cluster for some time (network issues, or we keep
> hitting some exception from YarnRM, or YarnRM master-slave switch is taking a
> long time etc.), we should consider the sub-cluster as unhealthy and stop
> routing asks there, until the heartbeat channel becomes healthy again.
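
For readers following the issue, the timeout idea in the description can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the class SubClusterHealthTracker, its method names, and the choice to treat a sub-cluster with no recorded heartbeat as healthy are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration; not part of the YARN-8581 patch. */
class SubClusterHealthTracker {
  // How long a sub-cluster may go without a successful heartbeat (ms).
  private final long timeoutMs;
  // Sub-cluster id -> timestamp (ms) of the last successful heartbeat.
  private final Map<String, Long> lastSuccess = new ConcurrentHashMap<>();

  SubClusterHealthTracker(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  /** Call whenever a heartbeat to the sub-cluster returns successfully. */
  void recordHeartbeatSuccess(String subClusterId, long nowMs) {
    lastSuccess.put(subClusterId, nowMs);
  }

  /** True if the last successful heartbeat is older than the timeout. */
  boolean isTimedOut(String subClusterId, long nowMs) {
    Long last = lastSuccess.get(subClusterId);
    // Assumption: a sub-cluster we have never heartbeated to is not
    // penalized, so that it can still receive its first asks.
    return last != null && nowMs - last > timeoutMs;
  }

  /** Filters the active and enabled sub-clusters down to the healthy ones. */
  List<String> routable(List<String> activeAndEnabled, long nowMs) {
    List<String> result = new ArrayList<>();
    for (String id : activeAndEnabled) {
      if (!isTimedOut(id, nowMs)) {
        result.add(id);
      }
    }
    return result;
  }
}
```

In this sketch a policy would call routable(...) before splitting the asks, and a sub-cluster becomes routable again as soon as recordHeartbeatSuccess(...) is called for it, which mirrors "until the heartbeat channel becomes healthy again". Timestamps are passed in explicitly only to keep the example easy to test.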