[ https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell updated SPARK-3438:
-----------------------------------
    Description:
Access to secured HDFS is currently supported in YARN using YARN's built-in security mechanism. In YARN mode, a user application is authenticated when it is submitted; it then acquires delegation tokens and ships them (via YARN) securely to the workers.

In Standalone mode, it would be nice to support a mechanism for accessing HDFS in which a single shared secret authenticates communication within the standalone cluster.

1. A company is running a standalone cluster.
2. They are fine if all Spark jobs in the cluster share a global secret, i.e. all Spark jobs can trust one another.
3. They are able to provide a Hadoop login on the driver node via a keytab or kinit. They want tokens from this login to be distributed to the executors to allow access to secure HDFS.
4. They also don't want to trust the network on the cluster, i.e. they don't want to allow someone to fetch HDFS tokens easily over a known protocol without authentication.

  was:
Secured HDFS is supported in YARN currently, but not in standalone mode. The tricky bit is how to disseminate the delegation tokens securely in standalone mode.

> Support for accessing secured HDFS in Standalone Mode
> -----------------------------------------------------
>
>                 Key: SPARK-3438
>                 URL: https://issues.apache.org/jira/browse/SPARK-3438
>             Project: Spark
>          Issue Type: New Feature
>          Components: Deploy, Spark Core
>    Affects Versions: 1.0.2
>            Reporter: Zhanfeng Huo
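
Requirement 4 in the description (no fetching of HDFS tokens without authentication) could be approached with a challenge-response check against the cluster-wide shared secret. The following is only a minimal sketch under that assumption, not the design proposed in this issue or anything Spark implements: the function names (make_challenge, sign_challenge, release_tokens) and the opaque token_blob are hypothetical, and only the Python standard library is used.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Driver side: generate a fresh random nonce for each token request."""
    return secrets.token_bytes(32)


def sign_challenge(shared_secret: bytes, challenge: bytes) -> bytes:
    """Executor side: prove knowledge of the shared secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def release_tokens(shared_secret: bytes, challenge: bytes,
                   response: bytes, token_blob: bytes) -> bytes:
    """Driver side: hand out the serialized tokens only to an authenticated peer.

    token_blob stands in for delegation tokens obtained from the driver's
    Hadoop login; how they are obtained and serialized is out of scope here.
    """
    expected = sign_challenge(shared_secret, challenge)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, response):
        raise PermissionError("executor failed shared-secret authentication")
    return token_blob
```

An executor holding the secret would receive the nonce from the driver, reply with sign_challenge(secret, nonce), and get the tokens back only if release_tokens accepts the reply. Note that this sketch covers authentication only; the tokens would still need to be encrypted in transit, since the network is not trusted.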