[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Botong Huang updated YARN-6955:
-------------------------------
    Attachment: YARN-6955.v1.patch

> Concurrent registerAM thread in Federation Interceptor
> ------------------------------------------------------
>
>                 Key: YARN-6955
>                 URL: https://issues.apache.org/jira/browse/YARN-6955
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Minor
>         Attachments: YARN-6955.v1.patch
>
>
> The timeout between the AM and AMRMProxy is shorter than the timeout plus
> failover period between FederationInterceptor (FI, in AMRMProxy) and the RM.
> When the first register thread in FI is blocked because of an RM failover,
> the AM can time out and resend the register call, leading to two outstanding
> register calls inside FI. Eventually, when the RM comes back up, one thread
> registers successfully and the other thread gets an "application already
> registered" exception. FI should swallow the exception and return success to
> the AM in both threads.
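
For illustration only, the proposed behavior could be sketched roughly as below. This is not the attached patch and it does not use the Hadoop API: RegisterSketch, FakeRm, Interceptor, RegisterResponse and AlreadyRegisteredException are all hypothetical stand-ins, and the fake RM simply accepts the first register call and rejects any later one. The sketch assumes both register threads run in the same interceptor instance, so the thread that receives the rejection can wait for the accepted response and return it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

class RegisterSketch {

  /** Hypothetical stand-in for the RM's "already registered" error. */
  static class AlreadyRegisteredException extends Exception {
    AlreadyRegisteredException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for the register response sent back to the AM. */
  static class RegisterResponse {
    final String queue;

    RegisterResponse(String queue) {
      this.queue = queue;
    }
  }

  /** Fake RM: accepts the first register call and rejects every later one. */
  static class FakeRm {
    private final AtomicBoolean registered = new AtomicBoolean(false);

    RegisterResponse register() throws AlreadyRegisteredException {
      if (!registered.compareAndSet(false, true)) {
        throw new AlreadyRegisteredException("application already registered");
      }
      return new RegisterResponse("default");
    }
  }

  /** Interceptor side (hypothetical): every caller gets a success response. */
  static class Interceptor {
    private final FakeRm rm;
    // Completed by whichever thread's register call the RM accepts.
    private final CompletableFuture<RegisterResponse> accepted =
        new CompletableFuture<>();

    Interceptor(FakeRm rm) {
      this.rm = rm;
    }

    RegisterResponse register() {
      try {
        RegisterResponse response = rm.register();
        accepted.complete(response);
        return response;
      } catch (AlreadyRegisteredException e) {
        // Another thread already registered this application: swallow the
        // exception and hand back the response from the accepted call.
        return accepted.join();
      }
    }
  }

  /** Issues two concurrent register calls, as a retrying AM would. */
  static RegisterResponse[] registerTwice() {
    Interceptor interceptor = new Interceptor(new FakeRm());
    Callable<RegisterResponse> call = interceptor::register;
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Future<RegisterResponse> first = pool.submit(call);
      Future<RegisterResponse> second = pool.submit(call);
      return new RegisterResponse[] {first.get(), second.get()};
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    RegisterResponse[] responses = registerTwice();
    // prints "default default"
    System.out.println(responses[0].queue + " " + responses[1].queue);
  }
}
```

Whichever thread the RM accepts, both callers see the same response. A real interceptor would presumably also bound the wait in the rejected thread and handle the case where the accepted call fails before its response is recorded; the sketch leaves both out.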