[ 
https://issues.apache.org/jira/browse/YARN-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472693#comment-16472693
 ] 

Billie Rinaldi commented on YARN-8265:
--------------------------------------

How to test this bug:
# Run a simple app whose launch command sleeps and then exits.
{noformat}
{
  "name": "test-ip-change",
  "version": "1",
  "lifetime": "3600",
  "configuration": {
    "properties": {
      "docker.network": "bridge"
    }
  },
  "components" :
    [
      {
        "name": "centos7",
        "number_of_containers": 1,
        "artifact": {
          "id": "library/centos:7",
          "type": "DOCKER"
        },
        "launch_command": "sleep 60; exit 1",
        "resource": {
          "cpus": 2,
          "memory": "1024"
        }
      }
    ]
}
{noformat}
# Verify that the docker container has started running, and then run the 
following script (assuming you only have one docker container running in your 
test environment). The loop waits until the service's container goes down and 
then immediately starts a new docker container, which grabs the IP that the 
service's container was using. Leave this new docker container running so that 
it keeps holding that IP.
{noformat}
while docker ps | grep container_ > /dev/null; do :; done; docker run -it library/centos:7 bash
{noformat}
# After the service's container is restarted, verify that it has a new IP. 
Before the patch is applied, verify the AM and RegistryDNS have not received 
the new IP of the container. After the patch is applied, verify that they do 
receive the new IP.
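
The IP comparison in the last step can be done from the host with docker inspect. The following is a minimal sketch, assuming the docker CLI is on the PATH and that the service's container is the only running container whose name contains container_ (the same assumption as the script above). The function names are hypothetical helpers for illustration and are not part of the patch.

```shell
# Hypothetical helpers for checking whether the restarted container got a new IP.
# Assumes the docker CLI is available; nothing here comes from the YARN patch.

# Print the ID of the first running container whose name contains "container_".
service_container_id() {
  docker ps --filter "name=container_" --format '{{.ID}}' | head -n 1
}

# Print the IP address of the given container on its attached network(s).
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Compare an old and a new IP and print "changed: OLD -> NEW", "unchanged",
# or "unknown" if either value is empty.
report_ip_change() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "unknown"
  elif [ "$1" = "$2" ]; then
    echo "unchanged"
  else
    echo "changed: $1 -> $2"
  fi
}
```

One possible way to use it: record old_ip=$(container_ip "$(service_container_id)") while the first container is still sleeping, and after the restart run report_ip_change "$old_ip" "$(container_ip "$(service_container_id)")". The reported IP can then be compared by hand with what the AM and RegistryDNS return.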

> AM should retrieve new IP for restarted container
> -------------------------------------------------
>
>                 Key: YARN-8265
>                 URL: https://issues.apache.org/jira/browse/YARN-8265
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Billie Rinaldi
>            Priority: Critical
>         Attachments: YARN-8265.001.patch, YARN-8265.002.patch
>
>
> When a docker container is restarted, it gets a new IP, but the service AM 
> only retrieves one IP for a container and then cancels the container status 
> retriever. I suspect the issue would be solved by restarting the retriever 
> (if it has been canceled) when the onContainerRestart callback is received, 
> but we'll have to do some testing to make sure this works.


