[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586390#comment-16586390 ]
Giovanni Matteo Fumarola edited comment on YARN-8673 at 8/20/18 7:24 PM:
-------------------------------------------------------------------------

LGTM +1. Committed to Trunk. Thanks [~botong].

was (Author: giovanni.fumarola):
LGTM +1. Committed to Trunk.

> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8673
>                 URL: https://issues.apache.org/jira/browse/YARN-8673
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: amrmproxy
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Major
>         Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch
>
>
> After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_
> is thrown from the new YarnRM. The AM then re-registers and resets the
> responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_
> follows the same protocol and performs the automatic re-register and
> responseId resync. However, when an exception or a temporary network issue
> occurs in the allocate call after re-register, the resync logic might break.
> This patch improves the robustness of the process by parsing the expected
> responseId from the YarnRM exception message, so that whenever the responseId
> is out of sync for whatever reason, we can automatically resync and move on.
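
For readers who want a concrete picture of the resync idea in the issue description, here is a minimal Java sketch. It is not the code from the attached patches. The class name ResponseIdResync, both method names, and the message fragment "expect responseId to be N" are assumptions made for illustration; the real exception text produced by YarnRM may differ. The sketch only shows the general approach: read the expected responseId out of the exception message and adopt it, falling back to the locally tracked value when the message carries no such hint.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual YARN-8673 code.
class ResponseIdResync {
    // ASSUMPTION: the RM reports the id it expects with this wording.
    private static final Pattern EXPECTED_ID =
        Pattern.compile("expect responseId to be (-?\\d+)");

    // Returns the expected responseId if the message contains one.
    static OptionalInt parseExpectedResponseId(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher m = EXPECTED_ID.matcher(message);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            // Digits that do not fit in an int: treat as "no hint".
            return OptionalInt.empty();
        }
    }

    // Adopts the id from the message, or keeps the local id if none is found.
    static int resync(int localResponseId, String exceptionMessage) {
        return parseExpectedResponseId(exceptionMessage).orElse(localResponseId);
    }

    public static void main(String[] args) {
        String msg = "Invalid responseId, expect responseId to be 0, but get 7";
        System.out.println(resync(7, msg));                 // id taken from message
        System.out.println(resync(7, "connection reset"));  // local id kept
    }
}
```

Keeping the local id when nothing can be parsed means an unrelated failure, such as a network error, does not reset the sequence by accident.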