[ 
https://issues.apache.org/jira/browse/YARN-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629691#comment-15629691
 ] 

Shane Kumpf commented on YARN-5818:
-----------------------------------

Did some initial testing here. Unfortunately, because docker uses a 
client/server model, client operations fail with an EOF error while the docker 
daemon is down for a restart/upgrade. Our use of {{docker wait}} to retrieve 
the container's exit code breaks down because that client operation fails 
during the restart/upgrade.
{code}
An error occurred trying to connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/c11692777816e44049d610c4ad358a24eefbff707cdbd85c24df3d153c80401e/wait: EOF
{code}

The docker community believes this is working as intended and does not plan to 
change this behavior. It appears we will have to handle retries in the 
container-executor (c-e).
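
A rough sketch of what retrying {{docker wait}} could look like is below. It is 
in Python purely for illustration (c-e itself is C), and the function name, 
retry count, and delay are hypothetical. It assumes that any failed client call 
means the daemon is unreachable; a real implementation would probably want to 
tell that case apart from other errors.

```python
import subprocess
import time


def wait_for_exit_code(container_id, retries=5, delay=2.0,
                       run=subprocess.run, sleep=time.sleep):
    """Return the container's exit code, retrying while the docker client fails.

    `run` and `sleep` are injectable so the retry logic can be exercised
    without a docker daemon; both names are illustrative only.
    """
    last_error = ""
    for attempt in range(retries):
        result = run(
            ["docker", "wait", container_id],
            capture_output=True,
            text=True,
        )
        output = result.stdout.strip()
        # On success, `docker wait` prints the container's exit code on stdout.
        if result.returncode == 0 and output.isdigit():
            return int(output)
        # Assumption: a failed client call means the daemon is down for a
        # restart/upgrade, so pause and try again.
        last_error = result.stderr.strip()
        if attempt < retries - 1:
            sleep(delay)
    raise RuntimeError(
        "docker wait failed after %d attempts: %s" % (retries, last_error)
    )
```

With live restore enabled the container keeps running while the daemon is 
down, so the idea is that a later attempt should succeed once the daemon is 
back. A fixed delay is used here only to keep the sketch short.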

> Support the Docker Live Restore feature
> ---------------------------------------
>
>                 Key: YARN-5818
>                 URL: https://issues.apache.org/jira/browse/YARN-5818
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Shane Kumpf
>
> Docker 1.12.x introduced the docker [Live 
> Restore|https://docs.docker.com/engine/admin/live-restore/] feature which 
> allows docker containers to survive docker daemon restarts/upgrades. Support 
> for this feature should be added to YARN to allow docker changes and upgrades 
> to be less impactful to existing containers.



