[ 
https://issues.apache.org/jira/browse/YARN-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469147#comment-16469147
 ] 

Billie Rinaldi commented on YARN-8265:
--------------------------------------

[~eyang], thanks for working on this patch. There seem to be two problems. The 
first is that a BECOME_READY event does not start the container status 
retriever. The bigger problem is that I think I was mistaken about when 
onContainerRestart is received: it appears to be received only after the AM 
initiates a container restart, not after the NM relaunches a container. I don't 
see an NM client callback that informs the AM when the NM has decided to 
relaunch a container. So I'll have to think about whether there is a workaround 
we could implement in the AM, or whether we'll have to wait for a new NM 
callback before this issue can be fixed.
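
To make the intended behavior concrete, here is a minimal sketch of a retriever 
that cancels itself once it has an IP and is started again on a ready or restart 
event. It is not taken from the patch or from the YARN code base: the class name, 
the method names (onBecomeReady, onContainerRestart as a local method, tick) and 
the Supplier-based status query are all assumptions made for illustration, and 
tick() stands in for one timer-driven poll.

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch: none of these names come from the YARN code base.
 * A real retriever would poll on a timer; here tick() stands in for one poll.
 */
class ContainerIpRetrieverSketch {
    // Assumed status query: returns the container IP, or null if not yet known.
    private final Supplier<String> statusQuery;
    private boolean active; // false while the retriever is cancelled
    private String ip;      // last IP retrieved, or null

    ContainerIpRetrieverSketch(Supplier<String> statusQuery) {
        this.statusQuery = statusQuery;
    }

    /** Starts the retriever; a no-op if it is already running. */
    void start() {
        active = true;
    }

    /** One poll: store the IP and cancel the retriever once an IP is found. */
    void tick() {
        if (!active) {
            return;
        }
        String found = statusQuery.get();
        if (found != null) {
            ip = found;
            active = false; // mirrors "retrieves one IP, then cancels"
        }
    }

    /** Assumed handler for a BECOME_READY event: make sure polling is running. */
    void onBecomeReady() {
        start();
    }

    /** Assumed handler for a restart: drop the stale IP and poll again. */
    void onContainerRestart() {
        ip = null;
        start();
    }

    String getIp() {
        return ip;
    }

    boolean isActive() {
        return active;
    }
}
```

Note that this only helps when a restart event actually reaches the AM. For an 
NM-initiated relaunch, nothing in this sketch would trigger onContainerRestart, 
which is the gap described above.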

> AM should retrieve new IP for restarted container
> -------------------------------------------------
>
>                 Key: YARN-8265
>                 URL: https://issues.apache.org/jira/browse/YARN-8265
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8265.001.patch
>
>
> When a Docker container is restarted, it gets a new IP, but the service AM 
> only retrieves one IP for a container and then cancels the container status 
> retriever. I suspect the issue would be solved by restarting the retriever 
> (if it has been canceled) when the onContainerRestart callback is received, 
> but we'll have to do some testing to make sure this works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)