[ 
https://issues.apache.org/jira/browse/YARN-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9647:
----------------------------------
    Attachment: YARN-9647.002.patch

> Docker launch fails when local-dirs or log-dirs is unhealthy.
> -------------------------------------------------------------
>
>                 Key: YARN-9647
>                 URL: https://issues.apache.org/jira/browse/YARN-9647
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.1.2
>            Reporter: KWON BYUNGCHANG
>            Priority: Major
>         Attachments: YARN-9647.001.patch, YARN-9647.002.patch
>
>
> my /etc/hadoop/conf/container-executor.cfg
> {code}
> [docker]
>    docker.allowed.ro-mounts=/data1/hadoop/yarn/local,/data2/hadoop/yarn/local
>    docker.allowed.rw-mounts=/data1/hadoop/yarn/local,/data2/hadoop/yarn/local
> {code}
> if /data2 is unhealthy, docker launch fails  although container can use 
> /data1 as local-dir, log-dir 
> error message is below
> {code}
> [2019-06-25 14:55:26.168]Exception from container-launch. Container id: 
> container_e50_1561100493387_5185_01_000597 Exit code: 29 Exception message: 
> Launch container failed Shell error output: Could not determine real path of 
> mount '/data2/hadoop/yarn/local' Could not determine real path of mount 
> '/data2/hadoop/yarn/local' Unable to find permitted docker mounts on disk 
> Error constructing docker command, docker error code=16, error message='Mount 
> access error' Shell output: main : command provided 4 main : run as user is 
> magnum main : requested yarn user is magnum Creating script paths... Creating 
> local dirs... [2019-06-25 14:55:26.189]Container exited with a non-zero exit 
> code 29. [2019-06-25 14:55:26.192]Container exited with a non-zero exit code 
> 29. 
> {code}
> root cause is that normalize_mounts() in docker-util.c return -1  because it 
> cannot resolve real path of /data2/hadoop/yarn/local.(note that /data2 is 
> disk fault  at this point)
> however disk of nm local dirs and nm log dirs can fail at any time.
> docker launch should succeed if there are available local dirs and log dirs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to