[ 
https://issues.apache.org/jira/browse/YARN-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139424#comment-14139424
 ] 

Junping Du commented on YARN-2561:
----------------------------------

bq. It would be better to check the list of applications on the node. 
Sounds good. Updated in the v5 patch.

bq. When a node reconnects with no containers but has the same port, we aren't 
updating its potentially new totalCapability as we did before.
Yes. Updated to add newNode instead of the old rmNode.
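
To illustrate the point about totalCapability, here is a minimal sketch of the idea, not the actual patch. The class ReconnectSketch, its Node type, and the reconnect method are hypothetical stand-ins, not YARN classes. The assumption is that when a node reconnects with no running applications, the registry entry is replaced with the newly reported node, so any changed capability takes effect. What happens when applications are still running is left out of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class ReconnectSketch {
    /** Hypothetical stand-in for a registered node; not the real RMNode. */
    static final class Node {
        final String hostPort;
        final int memoryMb;
        final int vcores;

        Node(String hostPort, int memoryMb, int vcores) {
            this.hostPort = hostPort;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    // Registry of known nodes, keyed by host:port (an assumption of this sketch).
    private final Map<String, Node> nodes = new HashMap<>();

    void register(Node node) {
        nodes.put(node.hostPort, node);
    }

    /**
     * Handles a reconnect. If no applications are running on the node,
     * the old entry is replaced by newNode, so a changed capability is
     * picked up. Otherwise the existing entry is left untouched here.
     */
    void reconnect(Node newNode, int runningApps) {
        if (runningApps == 0) {
            nodes.put(newNode.hostPort, newNode);
        }
    }

    Node get(String hostPort) {
        return nodes.get(hostPort);
    }

    public static void main(String[] args) {
        ReconnectSketch rm = new ReconnectSketch();
        rm.register(new Node("nm1:45454", 8192, 8));
        // Same host and port, but the node now reports more memory and vcores.
        rm.reconnect(new Node("nm1:45454", 16384, 16), 0);
        System.out.println(rm.get("nm1:45454").memoryMb); // prints 16384
    }
}
```

Storing the old node object again in the first branch would keep the stale 8192 MB value, which is the regression described in the quoted comment.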

> MR job client cannot reconnect to AM after NM restart.
> ------------------------------------------------------
>
>                 Key: YARN-2561
>                 URL: https://issues.apache.org/jira/browse/YARN-2561
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Tassapol Athiapinya
>            Assignee: Junping Du
>            Priority: Blocker
>         Attachments: YARN-2561-v2.patch, YARN-2561-v3.patch, 
> YARN-2561-v4.patch, YARN-2561-v5.patch, YARN-2561.patch
>
>
> Work-preserving NM restart is disabled.
> Submit a job, then restart the only NM. The job hangs with connection 
> retries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
