[ 
https://issues.apache.org/jira/browse/YARN-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798241#comment-16798241
 ] 

Eric Yang commented on YARN-5670:
---------------------------------

{quote}The cache can be backed by NMStateStore so even when the NM comes back 
or is restarted, it will know what images it localized.{quote}

The problem is not whether the LRU state is persisted by NMStateStore.  The 
problem is that docker image tags are moving targets.  Suppose the node 
manager tracks images by name and tag combo.  centos:latest has digest id: 
123, and this image was used yesterday.  The image is updated to digest id: 
234 today.  When the NM deletes centos:latest tomorrow because it has not been 
in use for 24 hours, the image with digest id: 123 will not be deleted, 
because it is no longer associated with the name it had two days earlier.
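
To make the tag-tracking case concrete, here is a minimal sketch in Python. 
The TagTrackedDaemon class is a hypothetical toy model of a local image store, 
not the Docker API or any NM code; it only assumes that pulling a tag moves it 
to the new digest and leaves the old digest on disk untagged.

```python
class TagTrackedDaemon:
    """Toy model of a local image store (hypothetical, not the Docker API)."""

    def __init__(self):
        self.tags = {}       # "name:tag" -> digest id
        self.images = set()  # digest ids present on disk

    def pull(self, tag, digest):
        # A pull moves the tag to the new digest; the old digest stays on disk.
        self.tags[tag] = digest
        self.images.add(digest)

    def remove_by_tag(self, tag):
        # Drop the tag, then drop the image only if no other tag points at it.
        digest = self.tags.pop(tag)
        if digest not in self.tags.values():
            self.images.discard(digest)

    def dangling(self):
        # Digests still on disk that no tag refers to.
        return self.images - set(self.tags.values())


daemon = TagTrackedDaemon()
daemon.pull("centos:latest", "123")     # yesterday: job uses digest 123
daemon.pull("centos:latest", "234")     # today: the tag moves to digest 234
daemon.remove_by_tag("centos:latest")   # tomorrow: NM cleans up by name:tag

print(daemon.images)      # digest 123 is still on disk
print(daemon.dangling())  # and nothing refers to it any more
```

In this model only digest 234 is removed, and digest 123 is left behind as a 
dangling image that the name and tag based cleanup can no longer find.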

Let's take another view and suppose the NM tracks images by digest id.  A 
system admin tagged centos:latest (digest id: 123) as 
private_image:my_version, hoping that no one will delete his image.  A job 
started with centos:latest, which resolved to digest id: 123.  centos:latest 
was updated to digest id: 234 by another job a few hours later.  24 hours 
later, private_image:my_version is deleted by the clean up job for digest id 
123, because private_image:my_version is the only tag still referencing 
digest id 123.
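
The digest-tracking case can be sketched the same way.  DigestTrackedStore, 
nm_last_used and the hour values are hypothetical names and numbers chosen 
for illustration, and the model simplifies removal to "deleting a digest 
drops every tag that points at it".

```python
class DigestTrackedStore:
    """Toy model of a local image store (hypothetical, not the Docker API)."""

    def __init__(self):
        self.tags = {}  # "name:tag" -> digest id

    def tag(self, name, digest):
        self.tags[name] = digest

    def remove_digest(self, digest):
        # Simplification: removing a digest drops every tag pointing at it.
        removed = sorted(t for t, d in self.tags.items() if d == digest)
        for t in removed:
            del self.tags[t]
        return removed


store = DigestTrackedStore()
nm_last_used = {}  # NM bookkeeping: digest id -> hour it was last used

store.tag("centos:latest", "123")
store.tag("private_image:my_version", "123")  # admin's private tag
nm_last_used["123"] = 0                       # hour 0: job resolves to 123

store.tag("centos:latest", "234")             # hour 3: tag moves to 234
nm_last_used["234"] = 3

now = 25                                      # more than 24 hours later
stale = [d for d, used in nm_last_used.items() if now - used >= 24]
removed = [t for d in stale for t in store.remove_digest(d)]

print(stale)       # only digest 123 is considered unused
print(removed)     # the admin's tag is what gets deleted
print(store.tags)  # centos:latest (digest 234) survives
```

The NM never saw private_image:my_version, so it has no way to know that 
digest 123 is still wanted by the admin.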

Hadoop 3.1.x and 3.2.x don't have clean up ability.  Therefore, dangling images 
are already accumulating in production systems.  There is no way to tell 
whether an image was pulled by the system or placed by an admin.  This makes 
option 1 less attractive to implement, because it cannot reach the desired 
clean state without the undesired side effects above.

Option 2 is safer to implement in Hadoop because it gives system admins an 
option to turn on.  They can prepare their internal infrastructure setup so 
that it is less of a one off, and reach the same definition of clean state that 
[Docker swarm uses with system prune|https://github.com/moby/moby/issues/31254].

> Add support for Docker image clean up
> -------------------------------------
>
>                 Key: YARN-5670
>                 URL: https://issues.apache.org/jira/browse/YARN-5670
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>         Attachments: Localization Support For Docker Images_002.pdf
>
>
> Regarding Docker image localization, we also need a way to clean up 
> old/stale Docker images to save storage space. We may extend the deletion 
> service to utilize "docker rm" to do this.
> This is related to YARN-3854 and may depend on its implementation. Please 
> refer to YARN-3854 for Docker image localization details.


