[ 
https://issues.apache.org/jira/browse/YARN-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020032#comment-14020032
 ] 

Xuan Gong commented on YARN-1779:
---------------------------------

Unfortunately, we do have an AMRMToken problem when the RMs fail over. The
token's service name is not set properly during failover, which causes an
authentication failure.
For example, suppose we have two RMs, rm1 and rm2, and rm2 is currently active.
The ApplicationMaster first creates an RPC connection to rm1 (in the process,
it sets the AMRMToken's service name to rm1's address) and saves rm1's proxy
object. Because rm1 is in standby, the AM fails over to rm2 and repeats the
same process, saving rm2's proxy object and resetting the AMRMToken's service
name to rm2's address. Everything works at this point. When failover happens
again, the AM fails over back to rm1. This time, however, it reads the saved
rm1 proxy object directly and does *not* reset the service name. The service
name is therefore still rm2's address, which causes an authentication failure
when the AM tries to authenticate with rm1.
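
The sequence above can be reproduced with a small toy model. This is only a
sketch of the described behavior, not Hadoop code: the Token, Proxy, and
CachingProvider types, the resetOnReuse flag, and the rm1/rm2 addresses are all
hypothetical names introduced for illustration. It assumes that authentication
succeeds only when the token's service name equals the address of the RM being
contacted, and it does not model the standby/active state of the RMs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the stale service name; none of these types are Hadoop classes. */
class StaleServiceNameSketch {

    /** Hypothetical stand-in for the AMRMToken; only the service name matters here. */
    static final class Token {
        String service;
    }

    /** Hypothetical stand-in for an RM proxy; remembers the address it was created for. */
    static final class Proxy {
        final String address;

        Proxy(String address) {
            this.address = address;
        }

        /** Assumption: authentication needs the token's service name to match this RM. */
        boolean serviceNameMatches(Token token) {
            return address.equals(token.service);
        }
    }

    /** Hypothetical failover provider that caches one proxy per RM address. */
    static final class CachingProvider {
        private final Map<String, Proxy> cache = new HashMap<>();
        private final Token token;
        private final boolean resetOnReuse;

        CachingProvider(Token token, boolean resetOnReuse) {
            this.token = token;
            this.resetOnReuse = resetOnReuse;
        }

        Proxy getProxy(String address) {
            Proxy proxy = cache.get(address);
            if (proxy == null) {
                // First connection to this RM: creating the proxy also sets the service name.
                token.service = address;
                proxy = new Proxy(address);
                cache.put(address, proxy);
            } else if (resetOnReuse) {
                // One possible remedy (an assumption, not necessarily the committed fix):
                // reset the service name whenever a saved proxy is reused.
                token.service = address;
            }
            return proxy;
        }
    }

    /** Replays rm1 -> rm2 -> rm1 and records whether the service name matched each time. */
    static List<Boolean> replay(boolean resetOnReuse) {
        Token token = new Token();
        CachingProvider provider = new CachingProvider(token, resetOnReuse);
        List<Boolean> results = new ArrayList<>();
        // Hypothetical RM addresses.
        for (String rm : new String[] {"rm1.example.com:8030", "rm2.example.com:8030",
                                       "rm1.example.com:8030"}) {
            results.add(provider.getProxy(rm).serviceNameMatches(token));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println("reuse without reset: " + replay(false)); // [true, true, false]
        System.out.println("reuse with reset:    " + replay(true));  // [true, true, true]
    }
}
```

In this model the third step fails only when the saved rm1 proxy is reused
without resetting the service name, which matches the failure described above.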


> Handle AMRMTokens across RM failover
> ------------------------------------
>
>                 Key: YARN-1779
>                 URL: https://issues.apache.org/jira/browse/YARN-1779
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Karthik Kambatla
>            Priority: Critical
>              Labels: ha
>
> Verify if AMRMTokens continue to work against RM failover. If not, we will 
> have to do something along the lines of YARN-986. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
