[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341761#comment-14341761
 ] 

Abin Shahab commented on YARN-3080:
-----------------------------------

[~chenchun], Also, other threads in the NM may depend on the actual PID of the 
launched JVM, and I'm not going to preclude any future processes from depending 
upon this. The pidfile is supposed to be populated right after the container is 
launched, not after the process has finished. The DockerContainerExecutor will 
not alter these fundamental api and expectations of the NM from a container 
executor. Therefore, getting the actual PID from docker inspect is the most 
accurate and recommended way.


> The DockerContainerExecutor could not write the right pid to container pidFile
> ------------------------------------------------------------------------------
>
>                 Key: YARN-3080
>                 URL: https://issues.apache.org/jira/browse/YARN-3080
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Beckham007
>            Assignee: Abin Shahab
>         Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
> YARN-3080.patch
>
>
> The docker_container_executor_session.sh is like this:
> {quote}
> #!/usr/bin/env bash
> echo `/usr/bin/docker inspect --format {{.State.Pid}} 
> container_1421723685222_0008_01_000002` > 
> /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp
> /bin/mv -f 
> /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp
>  
> /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid
> /usr/bin/docker run --rm  --name container_1421723685222_0008_01_000002 -e 
> GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e 
> GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e 
> GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e 
> GAIA_CONTAINER_ID=container_1421723685222_0008_01_000002 --memory=32M 
> --cpu-shares=1024 -v 
> /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002
>  -v 
> /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002
>  -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash 
> "/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002/launch_container.sh"
> {quote}
> The DockerContainerExecutor use docker inspect before docker run, so the 
> docker inspect couldn't get the right pid for the docker, signalContainer() 
> and nm restart would fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to