[ 
https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387117#comment-15387117
 ] 

Zhankun Tang commented on YARN-3854:
------------------------------------

[~shaneku...@gmail.com], thanks. I totally agree with your goals on the 
Docker image localization topic. Either the HDFS distributed cache or an 
HDFS-backed private repo is fine with me. I also mentioned the private repo 
approach in the doc, which says that this doc and patch are a solution for 
users who don't want to maintain a private repo. I believe a well-maintained 
private Docker repo will be a good choice for many people, and it doesn't need 
YARN to do extra work for it.

As for a docker pull happening while someone is pushing a new version of the 
image, I think it's a rolling update problem. The new version should have a new 
tag, and it will be OK for the administrator to manually roll out the update to 
the application.

Let's get back to this patch. In essence, "HDFS + save/load" tries to mimic a 
private Docker repo. There are two parts to consider in the whole process, and 
because of its simplicity this patch brings in extra steps/issues in each: 
* 1. Docker image generation and upload to a storage
** This patch needs an *extra "docker save" step* compared with Docker's own 
image generation. It also needs the *application to remember the URI*, while 
with Docker one just needs to know the tag name after uploading to the storage.
* 2. Image distribution/localization
** This patch *distributes a tar file* through the distributed cache, so it's 
hard to speed up distribution by downloading only the delta, as docker pull 
does. It therefore consumes more network bandwidth.
** A big issue is that this patch has a security risk, as mentioned by 
[~sidharta-s]. Thanks, Sidharta, for pointing this out; I hadn't realized it 
before. Because of potential tag name conflicts, different users may replace 
each other's Docker images. Currently we cannot avoid this, because YARN has no 
way to distinguish the tag names of two Docker image tar files. YARN only knows 
that this is a Docker image tar file, but cannot know whether loading it will 
replace someone else's image. Although there is also no tag name conflict check 
when we use "docker push", the administrator can avoid such conflicts when 
pushing so that each image has a unique tag name. In any case, it's a fact that 
this patch opens a hole for a user to attack existing Docker images. One way to 
solve this is to add an option in Docker that avoids a forced load if the tag 
name already exists.
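
To make the tag conflict above concrete, here is a minimal sketch of a check 
that could run before "docker load". It is not part of this patch. It assumes 
the tar file follows the "docker save" layout, with a top-level manifest.json 
whose entries carry a "RepoTags" list. The function names and the 
existing_tags input are hypothetical.

```python
import io
import json
import tarfile


def read_repo_tags(tar_bytes):
    """Return the set of tags recorded in a `docker save` archive.

    Assumption: the archive has a top-level manifest.json that is a JSON
    list of entries, each with a "RepoTags" field (which may be null).
    """
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
        # extractfile raises KeyError if manifest.json is missing.
        manifest = json.load(archive.extractfile("manifest.json"))
    tags = set()
    for entry in manifest:
        tags.update(entry.get("RepoTags") or [])
    return tags


def conflicting_tags(tar_bytes, existing_tags):
    """Return the tags in the archive that are already present locally.

    existing_tags is a hypothetical input: the tags already known on the
    node, however the caller chooses to collect them.
    """
    return read_repo_tags(tar_bytes) & set(existing_tags)
```

A caller could refuse to load the image whenever conflicting_tags returns a 
non-empty set. Note that this is only a check-then-load, so it is not atomic 
and does not replace an option in Docker itself.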

To sum up, this patch eliminates the need to set up a private repo, but brings 
extra work to the admin/application and has a potential risk due to the attack 
surface of Docker load. I'll raise this issue with Docker, and thanks again, 
folks. I also think we should state the motivation of this JIRA more clearly, 
[~sidharta-s]. Thoughts?



> Add localization support for docker images
> ------------------------------------------
>
>                 Key: YARN-3854
>                 URL: https://issues.apache.org/jira/browse/YARN-3854
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Zhankun Tang
>         Attachments: YARN-3854-branch-2.8.001.patch, 
> YARN-3854_Localization_support_for_Docker_image_v1.pdf, 
> YARN-3854_Localization_support_for_Docker_image_v2.pdf
>
>
> We need the ability to localize images from HDFS and load them for use when 
> launching docker containers. 



