Botong Huang created YARN-8673:
----------------------------------

             Summary: [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
                 Key: YARN-8673
                 URL: https://issues.apache.org/jira/browse/YARN-8673
             Project: Hadoop YARN
          Issue Type: Task
            Reporter: Botong Huang
            Assignee: Botong Huang

After a master-slave switch of YarnRM, the new YarnRM throws an _ApplicationNotRegisteredException_. The AM then re-registers and resets its responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ follows the same protocol and performs the re-register and responseId resync automatically. However, when an exception or a temporary network issue occurs in the allocate call after the re-register, the resync logic might break.

This patch improves the robustness of the process by parsing the expected responseId from the YarnRM exception message, so that whenever the responseId is out of sync, for whatever reason, we can automatically resync and move on.
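
To illustrate the idea of recovering the expected responseId from an exception message, here is a minimal, standalone Java sketch. It is not the patch itself: the class name `ResponseIdResync`, the method `parseExpectedResponseId`, and the message fragment "expect responseId to be N" are all assumptions made for illustration, and the real YarnRM wording and the actual helper in the patch may differ.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: shows one way to pull an
// expected responseId out of an exception message.
class ResponseIdResync {

    // ASSUMPTION: the exception message contains a fragment of this form.
    // The real YarnRM message format may be different.
    private static final Pattern EXPECTED_ID =
        Pattern.compile("expect responseId to be (-?\\d+)");

    /** Returns the expected responseId named in the message, or empty if none is found. */
    static OptionalInt parseExpectedResponseId(String exceptionMessage) {
        if (exceptionMessage == null) {
            return OptionalInt.empty();
        }
        Matcher m = EXPECTED_ID.matcher(exceptionMessage);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            // Digits matched but the value does not fit in an int.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Made-up message used only to exercise the parser.
        String msg = "Invalid responseId in AllocateRequest, "
            + "expect responseId to be 7, but get 3";

        int lastResponseId = 3; // the client's stale view
        OptionalInt expected = parseExpectedResponseId(msg);
        if (expected.isPresent()) {
            // Adopt the value the RM says it expects, then retry allocate.
            lastResponseId = expected.getAsInt();
        }
        System.out.println("resynced responseId = " + lastResponseId);
    }
}
```

Returning an empty `OptionalInt` when the message does not match lets the caller fall back to its existing handling (for example, rethrowing the exception) rather than guessing a responseId.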