[ https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387117#comment-15387117 ]
Zhankun Tang commented on YARN-3854:
------------------------------------

[~shaneku...@gmail.com], thanks. I fully agree with your goals for Docker image localization, and either approach, the HDFS distributed cache or an HDFS-backed private repo, is fine with me. The doc also mentions the private repo approach and states that this doc and patch target users who don't want to maintain a private repo. I believe a well-maintained private Docker repo will be a good choice for many people, and it requires no extra work from YARN.

As for a docker pull happening while someone is pushing a new version of an image, I think that is a rolling update problem. The new version should have a new tag, and having the administrator manually roll out the update to the application should be fine.

Let's get back to this patch. In essence, "HDFS + save/load" tries to mimic a private Docker repo. There are two parts of the process to consider, and because of its simplicity this patch brings extra steps/issues in each:

* 1. Docker image generation and upload to storage
** This patch requires an *extra "docker save" step* at image generation compared with plain Docker. It also requires the *application to remember the URI*, whereas with Docker one only needs to know the tag name after the upload.
* 2. Image distribution/localization
** This patch *distributes a tar file* through the distributed cache, so it cannot speed up distribution by downloading only the delta as docker pull does, and it consumes more network bandwidth.
** A big issue is that this patch carries a security risk, as mentioned by [~sidharta-s]. Thanks, Sidharta, for pointing this out; I hadn't realized it before. Because of potential tag name conflicts, different users may replace each other's Docker images. Currently we cannot avoid this, because YARN has no way to distinguish the tag names of two Docker image tar files. YARN only knows that it is a Docker image tar file; it cannot know whether loading it will replace someone else's image.
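
To make the tag conflict concrete, here is a rough sketch of the kind of pre-load check that is missing today. It is not part of the patch. It assumes the tar file was produced by a Docker version whose "docker save" output includes a top-level manifest.json with "RepoTags" entries; older archives may not have it. The function names are hypothetical, and the set of tags already present on the node is simply passed in, since how YARN would obtain it is an open question.

```python
import json
import tarfile


def tags_in_image_tar(tar_path):
    """Return the set of repo tags declared in a `docker save` tarball.

    Assumption: the archive has a top-level manifest.json whose entries
    carry a "RepoTags" list. Archives without it yield an empty set.
    """
    with tarfile.open(tar_path) as tar:
        try:
            member = tar.getmember("manifest.json")
        except KeyError:
            # No manifest: tags cannot be determined from this archive.
            return set()
        with tar.extractfile(member) as fh:
            manifest = json.load(fh)

    tags = set()
    for entry in manifest:
        # "RepoTags" can be null for untagged images.
        tags.update(entry.get("RepoTags") or [])
    return tags


def conflicting_tags(tar_path, existing_tags):
    """Tags in the tarball that would replace an image already on the node."""
    return tags_in_image_tar(tar_path) & set(existing_tags)
```

A localizer could refuse to load the tar file whenever conflicting_tags() returns a non-empty set. That only narrows the hole; it does not replace proper tag ownership or a check inside Docker itself.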

Although "docker push" does not check for tag name conflicts either, an administrator can avoid them when pushing, so that each image has a unique tag name. In any case, this patch does open a hole that lets a user attack existing Docker images. One way to solve this is to add an option to Docker that refuses to force the load if the tag name already exists.

To sum up, this patch eliminates the need to set up a private repo, but it brings extra work for admins/applications and carries a potential risk due to the attack surface of docker load. I'll raise this issue with Docker. Thanks again, folks. I also think we should state the motivation of this JIRA more clearly, [~sidharta-s]. Thoughts?

> Add localization support for docker images
> ------------------------------------------
>
>                 Key: YARN-3854
>                 URL: https://issues.apache.org/jira/browse/YARN-3854
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Zhankun Tang
>         Attachments: YARN-3854-branch-2.8.001.patch,
> YARN-3854_Localization_support_for_Docker_image_v1.pdf,
> YARN-3854_Localization_support_for_Docker_image_v2.pdf
>
>
> We need the ability to localize images from HDFS and load them for use when
> launching docker containers.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)