[ 
https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-3438:
-----------------------------------
    Description: 
Access to secured HDFS is currently supported in YARN using YARN's built-in
security mechanism. In YARN mode, a user application is authenticated when it
is submitted; it then acquires delegation tokens and ships them (via YARN)
securely to workers.

In Standalone mode, it would be nice to support a mechanism for accessing
HDFS where we rely on a single shared secret to authenticate communication in
the standalone cluster.

1. A company is running a standalone cluster.
2. They are fine if all Spark jobs in the cluster share a global secret, i.e. 
all Spark jobs can trust one another.
3. They are able to provide a Hadoop login on the driver node via a keytab or 
kinit. They want tokens from this login to be distributed to the executors to 
allow access to secure HDFS.
4. They also don't want to trust the network on the cluster, i.e. they don't
want to allow someone to fetch HDFS tokens easily over a known protocol
without authentication.
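
To make points 2-4 concrete, here is a minimal sketch of one way a shared
secret could gate token distribution: the driver hands out a one-time
challenge, and only a caller that can compute the matching HMAC gets the
serialized tokens. This is an illustration, not Spark's implementation. The
names TokenServer, issue_challenge, fetch_tokens and answer_challenge are
hypothetical, and the token payload is treated as opaque bytes. It covers
authentication only; the channel itself would still need to be encrypted.

```python
import hashlib
import hmac
import secrets


class TokenServer:
    """Hypothetical driver-side holder of serialized HDFS delegation tokens."""

    def __init__(self, shared_secret: bytes, token_bytes: bytes):
        self._secret = shared_secret
        self._tokens = token_bytes
        self._pending = set()  # challenges issued but not yet redeemed

    def issue_challenge(self) -> bytes:
        # A fresh random nonce per request.
        nonce = secrets.token_bytes(16)
        self._pending.add(nonce)
        return nonce

    def fetch_tokens(self, nonce: bytes, response: bytes) -> bytes:
        # Each nonce is single-use, so a captured response cannot be replayed.
        if nonce not in self._pending:
            raise PermissionError("unknown or already used challenge")
        self._pending.discard(nonce)
        expected = hmac.new(self._secret, nonce, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, response):
            raise PermissionError("caller does not know the shared secret")
        return self._tokens


def answer_challenge(shared_secret: bytes, nonce: bytes) -> bytes:
    """Executor side: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, nonce, hashlib.sha256).digest()
```

An executor would call issue_challenge, compute answer_challenge with the
cluster-wide secret, and pass both values to fetch_tokens. A caller without
the secret gets a PermissionError instead of the tokens.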

  was:Secured HDFS is supported in YARN currently, but not in standalone mode.
The tricky bit is how to disseminate the delegation tokens securely in
standalone mode.


> Support for accessing secured HDFS in Standalone Mode
> -----------------------------------------------------
>
>                 Key: SPARK-3438
>                 URL: https://issues.apache.org/jira/browse/SPARK-3438
>             Project: Spark
>          Issue Type: New Feature
>          Components: Deploy, Spark Core
>    Affects Versions: 1.0.2
>            Reporter: Zhanfeng Huo


