[ https://issues.apache.org/jira/browse/SPARK-20060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959349#comment-15959349 ]
Marcelo Vanzin commented on SPARK-20060:
----------------------------------------

Adding a link to SPARK-5158; they're not exactly duplicates, but maybe they
should be. In both cases, a spec should be written explaining what is being
done, with an explanation of how it is secure.

> Support Standalone visiting secured HDFS
> -----------------------------------------
>
>                 Key: SPARK-20060
>                 URL: https://issues.apache.org/jira/browse/SPARK-20060
>             Project: Spark
>          Issue Type: New Feature
>          Components: Deploy, Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Kent Yao
>
> h1. Brief design
> h2. Introductions
> The basic issue for Standalone mode accessing Kerberos-secured HDFS or other
> kerberized services is how to gather the delegation tokens on the driver side
> and deliver them to the executor side.
> When we run Spark on YARN, we set the tokens in the container launch context
> so that they are delivered automatically. The long-running-application
> problem caused by token expiration was fixed in SPARK-14743 by writing the
> tokens to HDFS, then periodically renewing them and updating the credential
> file.
> When running Spark in Standalone mode, we currently have no YARN-like
> implementation to obtain and deliver those tokens.
> h2. Implementations
> Firstly, we simply move the implementation of SPARK-14743, which is currently
> YARN-only, to the core module. We use it to gather the credentials we need,
> and also to update and renew them through credential files on HDFS.
> Secondly, credential files on secured HDFS are not reachable by executors
> before they get the tokens. Here we add a sequence configuration,
> `spark.deploy.credential.entities`, which the driver fills with
> `token.encodeToUrlString()` values before launching the executors. The
> executors fetch the credentials as a sequence of strings while fetching the
> driver-side Spark properties, and then decode them into tokens.
> Before setting up the `CoarseGrainedExecutorBackend`, we add the credentials
> to the current executor-side UGI.
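
The hand-off described under "Implementations" can be sketched as a round trip through a single string property. This is a minimal illustration only, not the proposed patch. It treats each token as an opaque byte array and uses the JDK's URL-safe Base64 as a stand-in for Hadoop's `token.encodeToUrlString()`. The class name `CredentialEntities`, the helper methods, and the comma-separated layout of the property value are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark or Hadoop.
final class CredentialEntities {
    // Property name from the design. The comma-separated value layout is an assumption.
    static final String KEY = "spark.deploy.credential.entities";

    // Driver side: turn each serialized token into a URL-safe string and
    // join the strings into one comma-separated property value.
    static String encode(List<byte[]> serializedTokens) {
        Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();
        List<String> parts = new ArrayList<>();
        for (byte[] token : serializedTokens) {
            parts.add(encoder.encodeToString(token));
        }
        return String.join(",", parts);
    }

    // Executor side: split the property value and decode each entry back to bytes.
    static List<byte[]> decode(String value) {
        List<byte[]> tokens = new ArrayList<>();
        if (value == null || value.isEmpty()) {
            return tokens;
        }
        Base64.Decoder decoder = Base64.getUrlDecoder();
        for (String part : value.split(",")) {
            tokens.add(decoder.decode(part));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Placeholder payloads standing in for serialized delegation tokens.
        List<byte[]> tokens = List.of(
            "hdfs-token-bytes".getBytes(StandardCharsets.UTF_8),
            "hive-token-bytes".getBytes(StandardCharsets.UTF_8));

        String value = encode(tokens);
        System.out.println(KEY + "=" + value);

        for (byte[] token : decode(value)) {
            System.out.println(new String(token, StandardCharsets.UTF_8));
        }
    }
}
```

In a real implementation the executor would presumably rebuild Hadoop `Token` objects from the strings and add them to the current user's UGI before starting `CoarseGrainedExecutorBackend`. Note that anything carried in a plain property is visible to whoever can read the Spark properties, which is the kind of question the requested spec would need to address.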