[ 
https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681713#comment-14681713
 ] 

Dustin Cote commented on YARN-3924:
-----------------------------------

Yes, [~ajithshetty] that's the point I'm trying to get across.  The scenario 
that is problematic is:
{quote}
Configuring wrong/invalid ha.rm-ids at client is user mistake, this can be 
rechecked by user.
{quote} 

Returning "Connection Refused" gives the user no indication that this is what 
happened.  Users who see this message generally go looking for closed ports 
or firewall issues, when really they've just forgotten to change their Oozie 
workflow to point to a logical RM name after enabling HA.  This kind of error 
is doubly hard to debug because it is intermittent: when a failover occurs, 
the workflow suddenly starts working again.  Yes, this is the current RM HA 
design, so the fix is not as simple as changing the message or exception 
type.  That said, I still think it's a good supportability/usability 
improvement. 

> Submitting an application to standby ResourceManager should respond better 
> than Connection Refused
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3924
>                 URL: https://issues.apache.org/jira/browse/YARN-3924
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Dustin Cote
>            Assignee: Ajith S
>            Priority: Minor
>
> When submitting an application directly to a standby resource manager, the 
> resource manager responds with 'Connection Refused' rather than indicating 
> that it is a standby resource manager.  Because the resource manager is aware 
> of its own state, I feel like we can have the 8032 port open for standby 
> resource managers and reject the request with something like 'Cannot process 
> application submission from this standby resource manager'.  
> This would be especially helpful for debugging Oozie problems when users put 
> in the wrong address for the 'jobtracker' (i.e. they point to a specific 
> resource manager rather than the logical RM address).  
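
To make the proposed behavior concrete, here is a minimal, self-contained 
sketch of a resource manager that checks its own HA state and rejects a 
submission with a descriptive message instead of refusing the connection.  
Everything in it is hypothetical and for illustration only: 
StandbyRejectionSketch, ToyResourceManager, HaState, and 
StandbyRejectedException are not YARN classes, and the real change would have 
to fit into the existing RM HA design.

```java
import java.io.IOException;

/** Illustrative only: none of these types are real YARN classes. */
public class StandbyRejectionSketch {

    /** Hypothetical HA states for the toy resource manager. */
    enum HaState { ACTIVE, STANDBY }

    /** Hypothetical exception carrying a message that names the real cause. */
    static class StandbyRejectedException extends IOException {
        StandbyRejectedException(String message) {
            super(message);
        }
    }

    /** Toy resource manager that is aware of its own HA state. */
    static class ToyResourceManager {
        private final String id;
        private volatile HaState state;

        ToyResourceManager(String id, HaState state) {
            this.id = id;
            this.state = state;
        }

        /** Simulates a failover by switching this instance's state. */
        void transitionTo(HaState newState) {
            this.state = newState;
        }

        /** Accepts the submission only when active; otherwise explains why not. */
        String submitApplication(String appName) throws StandbyRejectedException {
            if (state == HaState.STANDBY) {
                throw new StandbyRejectedException(
                    "Cannot process application submission from this standby "
                        + "resource manager (" + id + ")");
            }
            return "accepted " + appName + " on " + id;
        }
    }

    public static void main(String[] args) throws Exception {
        // The client was pointed directly at rm2, which is currently standby.
        ToyResourceManager rm2 = new ToyResourceManager("rm2", HaState.STANDBY);
        try {
            rm2.submitApplication("oozie-launcher");
        } catch (StandbyRejectedException e) {
            // The user sees the actual cause rather than "Connection Refused".
            System.out.println(e.getMessage());
        }

        // After a failover the same address starts working, which is why the
        // misconfiguration looks intermittent from the user's side.
        rm2.transitionTo(HaState.ACTIVE);
        System.out.println(rm2.submitApplication("oozie-launcher"));
    }
}
```

With a message like this, a user who pointed a workflow at a specific 
resource manager would be told so directly, instead of being sent off to 
check ports and firewalls.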



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
