[ 
https://issues.apache.org/jira/browse/YARN-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15617186#comment-15617186
 ] 

Vinod Kumar Vavilapalli commented on YARN-5428:
-----------------------------------------------

[[email protected]], [~vvasudev], [~tangzhankun],

I read through the attached design doc. My thoughts follow.

----

We already have an existing model on how to deal with artifact localization:

A client (an end-user client or an ApplicationMaster) asks NodeManagers to launch 
containers.

As part of launching containers, the client specifies two things so that the 
NodeManager can download the requisite artifacts (Java API _LocalResource_):
 # The list of artifacts - files, jars, tarballs, etc. - each qualified as a URI
    -- The URI tells the NodeManager where the artifact is, and the scheme 
tells it how to get to that path.
    -- NodeManager today only understands Hadoop FileSystem specific URIs.
 # In addition to the URI, a set of credentials that the user passes along 
to the NodeManager, saying "here, use these credentials to talk to the remote 
storage on my behalf and download the files onto the local box. 
By the way, keep the credentials safe."
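
To make the model above concrete, here is a minimal sketch in plain Java. It deliberately does not use the YARN classes: _LaunchRequest_, _canLocalize_ and _unsupported_ are hypothetical names invented for illustration, and the set of accepted schemes is an assumption, not the NodeManager's actual list.

```java
import java.net.URI;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class LaunchRequestSketch {

    /** Hypothetical stand-in for a launch request: artifact URIs plus opaque credentials. */
    static final class LaunchRequest {
        final List<URI> artifacts;
        final ByteBuffer credentials;

        LaunchRequest(List<URI> artifacts, ByteBuffer credentials) {
            this.artifacts = List.copyOf(artifacts);
            // The receiver only reads the credentials; it never needs to modify them.
            this.credentials = credentials.asReadOnlyBuffer();
        }
    }

    // Assumption for this sketch: schemes backed by a Hadoop FileSystem implementation.
    static final Set<String> FILESYSTEM_SCHEMES = Set.of("hdfs", "file");

    /** The scheme alone decides whether the receiver knows how to fetch the artifact. */
    static boolean canLocalize(URI artifact) {
        String scheme = artifact.getScheme();
        return scheme != null
                && FILESYSTEM_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    /** Artifacts in the request that the receiver would not know how to download. */
    static List<URI> unsupported(LaunchRequest request) {
        List<URI> rejected = new ArrayList<>();
        for (URI artifact : request.artifacts) {
            if (!canLocalize(artifact)) {
                rejected.add(artifact);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        LaunchRequest request = new LaunchRequest(
                List.of(URI.create("hdfs://namenode:8020/apps/job.jar"),
                        URI.create("docker://registry.example.com/library/centos:7")),
                ByteBuffer.allocate(0));
        System.out.println("unsupported: " + unsupported(request));
    }
}
```

The point of the sketch is only that the request carries both the locations and the credentials, and that the receiving side dispatches on the URI scheme.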

----

Why can't we extend this mechanism from the API perspective? This is roughly 
your approach (4.2) in the doc.
 # Extend the URL notion to include docker paths
 # Pass the docker login credentials along as well, similar to how we pass 
Container credentials today. Obviously, this needs more plumbing.
 # Other non-secret parameters can be passed along as another configuration 
file in the distributed cache. This does create a need for yet another 
storage system besides docker, but we already need one to kickstart the 
AM, distribute more config files, etc.
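
As a rough illustration of the first point, the sketch below treats a docker image as just another artifact URI and picks the store from the scheme. Everything here is hypothetical: the "docker" URI scheme, the _Store_ enum, and the _storeFor_ and _imageReference_ helpers are assumptions made for this example, not anything YARN or the docker client defines.

```java
import java.net.URI;

public class DockerArtifactSketch {

    /** Which kind of store an artifact URI points at, decided by scheme alone. */
    enum Store { HADOOP_FILESYSTEM, DOCKER_REGISTRY }

    static Store storeFor(URI artifact) {
        // "docker" as a URI scheme is an assumption made for this sketch.
        return "docker".equalsIgnoreCase(artifact.getScheme())
                ? Store.DOCKER_REGISTRY
                : Store.HADOOP_FILESYSTEM;
    }

    /** Rebuilds a registry/repository:tag style reference from the URI parts. */
    static String imageReference(URI artifact) {
        if (storeFor(artifact) != Store.DOCKER_REGISTRY) {
            throw new IllegalArgumentException("not a docker artifact: " + artifact);
        }
        // Authority holds the registry host (and port); path holds repository and tag.
        return artifact.getAuthority() + artifact.getPath();
    }

    public static void main(String[] args) {
        URI image = URI.create("docker://registry.example.com:5000/library/centos:7");
        URI jar = URI.create("hdfs://namenode:8020/apps/job.jar");
        System.out.println(storeFor(image) + " " + imageReference(image));
        System.out.println(storeFor(jar));
    }
}
```

The login credentials would travel separately from the URI, in the same way the existing credentials do, so the URI itself never has to carry a secret.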

Your argument about every framework needing to do some work is valid. But 
almost all of today's frameworks already have user-defined configuration 
properties for specifying additional distributed-cache files without a code 
change every time.

----

And finally, today, NodeManager doesn't have a notion of a _default artifacts 
store_. Every app specifies where its files are. We should try to keep this. If 
not, we should add a first-class notion of a _default artifacts store_ for both 
docker and non-docker containers.

> Allow for specifying the docker client configuration directory
> --------------------------------------------------------------
>
>                 Key: YARN-5428
>                 URL: https://issues.apache.org/jira/browse/YARN-5428
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>              Labels: oct16-medium
>         Attachments: YARN-5428.001.patch, YARN-5428.002.patch, 
> YARN-5428.003.patch, YARN-5428.004.patch, 
> YARN-5428Allowforspecifyingthedockerclientconfigurationdirectory.pdf
>
>
> The docker client allows for specifying a configuration directory that 
> contains the docker client's configuration. It is common to store "docker 
> login" credentials in this config, to avoid the need to docker login on each 
> cluster member. 
> By default the docker client config is $HOME/.docker/config.json on Linux. 
> However, this does not work with the current container executor user 
> switching and it may also be desirable to centralize this configuration 
> beyond the single user's home directory.
> Note that the command line arg is for the configuration directory NOT the 
> configuration file.
> This change will be needed to allow YARN to automatically pull images at 
> localization time or within container executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
