[
https://issues.apache.org/jira/browse/YARN-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15617186#comment-15617186
]
Vinod Kumar Vavilapalli commented on YARN-5428:
-----------------------------------------------
[[email protected]], [~vvasudev], [~tangzhankun],
I read through the attached design doc. My thoughts follow.
----
We already have a model for dealing with artifact localization:
A client (an end-user client or an ApplicationMaster) asks NodeManagers to
launch containers.
As part of launching containers, the client specifies two things so that the
NodeManager can download the requisite artifacts (Java API _LocalResource_):
# The list of artifacts - files, jars, tar-balls etc each qualified as a URI
-- The URI tells the NodeManager where the artifact is, and the scheme
tells it how to get to that path
-- NodeManager today only understands Hadoop FileSystem specific URIs.
# In addition to the URI, a bunch of credentials that the user himself/herself
passes along to the NodeManager saying "here, use these credentials to talk to
the remote storage, representing me, and download the files on the local box.
By the way, keep the credentials safe."
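
To make the two pieces concrete, here is a minimal, self-contained sketch. It does not use the real YARN classes: _Artifact_ and _LaunchRequest_ are hypothetical stand-ins for _LocalResource_ and the container launch request, and the HDFS host and path are made up for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LocalizationRequestSketch {

    /** Hypothetical stand-in for one artifact entry; not the real LocalResource class. */
    static final class Artifact {
        final URI location;

        Artifact(String uri) {
            this.location = URI.create(uri);
        }

        /** The URI scheme is what tells the NodeManager how to fetch the artifact. */
        String scheme() {
            String s = location.getScheme();
            if (s == null) {
                throw new IllegalArgumentException("artifact URI needs a scheme: " + location);
            }
            return s.toLowerCase(Locale.ROOT);
        }
    }

    /** Hypothetical launch request: the artifact list plus opaque, user-supplied credentials. */
    static final class LaunchRequest {
        final List<Artifact> artifacts = new ArrayList<>();
        final byte[] credentials;

        LaunchRequest(byte[] credentials) {
            // Defensive copy so the caller cannot mutate the credentials later.
            this.credentials = credentials.clone();
        }
    }

    public static void main(String[] args) {
        LaunchRequest request = new LaunchRequest(new byte[] {1, 2, 3});
        // Made-up example location; any Hadoop FileSystem URI would do here.
        request.artifacts.add(new Artifact("hdfs://namenode.example:8020/apps/job.jar"));
        for (Artifact artifact : request.artifacts) {
            System.out.println(artifact.scheme() + " -> " + artifact.location.getPath());
        }
    }
}
```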
----
Why can't we extend this mechanism from the API perspective? This is roughly
your approach (4.2) in the doc.
# Extend the URL notion to include docker paths
# Pass the login credentials along as well, similar to how we pass Container
credentials today. Obviously, this needs more plumbing to be able to pass along
docker login credentials.
# Other non-secretive parameters can be passed along as another configuration
file in distributed-cache. This obviously creates another need for yet-another
storage besides docker, but we already need this to be able to kickstart the
AM, spread around more config files etc.
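
As a rough sketch of the first point, the NodeManager could choose a fetcher from the URI scheme. Everything here is an assumption for illustration: the "docker" scheme, the _Fetcher_ enum and the registry name are hypothetical, and nothing below is existing YARN or Docker API.

```java
import java.net.URI;
import java.util.Locale;

public class ArtifactSchemeSketch {

    /** Illustrative labels for how an artifact would be localized. */
    enum Fetcher { HADOOP_FILESYSTEM, DOCKER_PULL }

    /**
     * Chooses a fetcher from the URI scheme. The "docker" scheme is an
     * assumption of this sketch; every other scheme is treated as a
     * Hadoop FileSystem URI, which is all the NodeManager understands today.
     */
    static Fetcher fetcherFor(URI artifact) {
        String scheme = artifact.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("artifact URI needs a scheme: " + artifact);
        }
        return "docker".equals(scheme.toLowerCase(Locale.ROOT))
                ? Fetcher.DOCKER_PULL
                : Fetcher.HADOOP_FILESYSTEM;
    }

    /** Turns a hypothetical docker://registry/repo:tag URI into "registry/repo:tag". */
    static String imageRef(URI artifact) {
        if (artifact.getAuthority() == null) {
            throw new IllegalArgumentException("docker URI needs a registry: " + artifact);
        }
        return artifact.getAuthority() + artifact.getPath();
    }

    public static void main(String[] args) {
        // Made-up registry and image name.
        URI image = URI.create("docker://registry.example:5000/team/app:1.0");
        System.out.println(fetcherFor(image) + " " + imageRef(image));
    }
}
```

The login credentials from the second point would still travel separately from the URI, the same way Container credentials do today.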
Your argument about every framework needing to do some work is valid. But
almost all of today's frameworks already have user-defined configuration
properties to specify additional distributed-cache files without changing code
every time.
----
And finally, today, NodeManager doesn't have a notion of a _default artifacts
store_. Every app specifies where its files are. We should try to keep this. If
not, we should add a first class notion of _default artifacts store_ both for
docker and non-docker containers.
> Allow for specifying the docker client configuration directory
> --------------------------------------------------------------
>
> Key: YARN-5428
> URL: https://issues.apache.org/jira/browse/YARN-5428
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Shane Kumpf
> Assignee: Shane Kumpf
> Labels: oct16-medium
> Attachments: YARN-5428.001.patch, YARN-5428.002.patch,
> YARN-5428.003.patch, YARN-5428.004.patch,
> YARN-5428Allowforspecifyingthedockerclientconfigurationdirectory.pdf
>
>
> The docker client allows for specifying a configuration directory that
> contains the docker client's configuration. It is common to store "docker
> login" credentials in this config, to avoid the need to docker login on each
> cluster member.
> By default the docker client config is $HOME/.docker/config.json on Linux.
> However, this does not work with the current container executor user
> switching and it may also be desirable to centralize this configuration
> beyond the single user's home directory.
> Note that the command line arg is for the configuration directory NOT the
> configuration file.
> This change will be needed to allow YARN to automatically pull images at
> localization time or within container executor.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)