[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116965#comment-16116965 ]
Botong Huang commented on YARN-6955:
------------------------------------

The unit test failures are unrelated to this patch.

> Concurrent registerAM thread in Federation Interceptor
> ------------------------------------------------------
>
>                 Key: YARN-6955
>                 URL: https://issues.apache.org/jira/browse/YARN-6955
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Minor
>         Attachments: YARN-6955.v1.patch, YARN-6955.v2.patch
>
>
> The timeout between AM and AMRMProxy is shorter than the timeout plus
> failover time between FederationInterceptor (AMRMProxy) and RM. When the
> first register thread in FI is blocked because of an RM failover, the AM can
> time out and resend the register call, leading to two outstanding register
> calls inside FI. Eventually, when the RM comes back up, one thread registers
> successfully and the other gets an "application already registered"
> exception. FI should swallow the exception and return success to the AM in
> both threads.
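
The behavior requested in the issue description can be illustrated with a minimal, self-contained sketch. This is not the code from the attached patches. `FakeResourceManager`, `AlreadyRegisteredException`, `SketchInterceptor`, and `RegisterDemo` are hypothetical stand-ins invented for illustration, and the real YARN request and response types are replaced by plain strings. The sketch only shows the idea that both outstanding register calls end up reporting success to the AM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the RM's "application already registered" error.
class AlreadyRegisteredException extends Exception {
    AlreadyRegisteredException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the RM: accepts the first register call, rejects later ones.
class FakeResourceManager {
    private final AtomicBoolean registered = new AtomicBoolean(false);

    String register(String appId) throws AlreadyRegisteredException {
        if (!registered.compareAndSet(false, true)) {
            throw new AlreadyRegisteredException("already registered: " + appId);
        }
        return "registered:" + appId;
    }
}

// Hypothetical interceptor: treats "already registered" as success.
class SketchInterceptor {
    private final FakeResourceManager rm;

    SketchInterceptor(FakeResourceManager rm) {
        this.rm = rm;
    }

    String registerApplicationMaster(String appId) {
        try {
            return rm.register(appId);
        } catch (AlreadyRegisteredException e) {
            // Another register call for this application already succeeded,
            // so swallow the exception and report success to the caller.
            return "registered:" + appId;
        }
    }
}

class RegisterDemo {
    // Issues two concurrent register calls and returns both responses.
    static List<String> registerTwice(String appId) {
        SketchInterceptor fi = new SketchInterceptor(new FakeResourceManager());
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> call = () -> fi.registerApplicationMaster(appId);
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(Arrays.asList(call, call))) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Both calls report success even though the fake RM rejected one of them.
        System.out.println(registerTwice("app_1"));
    }
}
```

In this sketch the losing thread rebuilds the success response itself. A real interceptor would more likely need to return the response obtained by the winning thread, which this simplified example does not attempt to model.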