[ 
https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681526#comment-14681526
 ] 

Rohith Sharma K S commented on YARN-3924:
-----------------------------------------

From the user's side, one would expect different exceptions for 
*connecting to a standby RM* and *connecting to an invalid/not started 
ResourceManager address*. But as per the RM HA design, both scenarios are 
treated the same. The reason is that a standby RM does not open any RPC server 
for client communication. If the client tries to submit a job, it retries 
for a certain amount of time against each of the configured *ha.rm-ids* and 
then throws a Connection Refused exception.
There are 2 possible reasons the client might throw Connection Refused:
# Configuring wrong/invalid *ha.rm-ids* at the client is a user mistake; the 
user can recheck this.
# Both RMs being in standby for a long time is a problem on the YARN side, and 
we need to find the reason for this state. Ideally, if there is any issue with 
ZK, the RM will shut down after some time. If you can share the logs of both 
RMs while they are in standby, that would be helpful for analysis.
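
To illustrate why the client cannot tell the two cases apart, here is a 
minimal sketch of the retry loop described above. It is not the actual YARN 
client code: the class name, the method firstReachable, the acceptsConnection 
probe and the maxRounds limit are all hypothetical names chosen for this 
example. It only assumes that a standby RM and a wrong address both refuse the 
connection.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch, not the real YARN failover implementation.
public class RmFailoverSketch {

    /**
     * Tries each configured RM id in turn, for at most maxRounds passes.
     * Returns the first id whose RPC port accepts a connection, or empty
     * when every attempt was refused. Empty corresponds to the point where
     * a real client would give up and report Connection Refused.
     */
    static Optional<String> firstReachable(List<String> rmIds,
                                           Predicate<String> acceptsConnection,
                                           int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            for (String rmId : rmIds) {
                // A standby RM and an invalid address look identical here:
                // in both cases nothing is listening, so the probe fails.
                if (acceptsConnection.test(rmId)) {
                    return Optional.of(rmId);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> rmIds = List.of("rm1", "rm2");

        // rm1 is standby (refuses), rm2 is active (accepts).
        System.out.println(firstReachable(rmIds, id -> id.equals("rm2"), 3));

        // Both RMs in standby, or both ids misconfigured: same outcome.
        System.out.println(firstReachable(rmIds, id -> false, 3));
    }
}
```

The second call returns an empty result whether the ids are wrong or both RMs 
are in standby, which is why only the logs can separate the two 
possibilities listed above.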


> Submitting an application to standby ResourceManager should respond better 
> than Connection Refused
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3924
>                 URL: https://issues.apache.org/jira/browse/YARN-3924
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Dustin Cote
>            Assignee: Ajith S
>            Priority: Minor
>
> When submitting an application directly to a standby resource manager, the 
> resource manager responds with 'Connection Refused' rather than indicating 
> that it is a standby resource manager.  Because the resource manager is aware 
> of its own state, I feel like we can have the 8032 port open for standby 
> resource managers and reject the request with something like 'Cannot process 
> application submission from this standby resource manager'.  
> This would be especially helpful for debugging oozie problems when users put 
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM 
> address but rather point to a specific resource manager).  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
