[ https://issues.apache.org/jira/browse/SPARK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516856#comment-16516856 ]
Jelmer Kuperus edited comment on SPARK-5158 at 6/19/18 9:33 AM:
----------------------------------------------------------------
I ended up with the following workaround, which at first glance seems to work:

1. Create a _.java.login.config_ file in the home directory of the Spark user with the following contents:
{noformat}
com.sun.security.jgss.krb5.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  useTicketCache="true"
  ticketCache="/tmp/krb5cc_0"
  keyTab="/path/to/my.keytab"
  principal="u...@foo.com";
};
{noformat}
2. Put a krb5.conf file at /etc/krb5.conf.
3. Place your Hadoop configuration in /etc/hadoop/conf and, in `core-site.xml`, set:
* fs.defaultFS to webhdfs://your_hostname:14000/webhdfs/v1
* hadoop.security.authentication to kerberos
* hadoop.security.authorization to true
4. Make sure the Hadoop configuration is on Spark's classpath, e.g. the process should have something like this in it:
{noformat}
-cp /etc/spark/:/usr/share/spark/jars/*:/etc/hadoop/conf/
{noformat}
This configures a single principal for the entire Spark process. If you want to change the default paths to the configuration files, you can use:
{noformat}
-Djava.security.krb5.conf=/etc/krb5.conf
-Djava.security.auth.login.config=/path/to/jaas.conf
{noformat}
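For reference, the three `core-site.xml` settings from step 3 would look like this as an XML fragment; the hostname and port are the same placeholders used in the comment:
{noformat}
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>webhdfs://your_hostname:14000/webhdfs/v1</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
{noformat}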
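The two `-D` flags can also be set programmatically. A minimal sketch (the class name is hypothetical and the paths are the same placeholders as above; note the properties must be set before any JGSS/Kerberos classes are first loaded, or they may be ignored):

```java
// Hypothetical example: setting the Kerberos and JAAS config paths in code,
// equivalent to passing -Djava.security.krb5.conf and
// -Djava.security.auth.login.config on the command line.
public class KerberosProps {
    public static void main(String[] args) {
        // Equivalent to -Djava.security.krb5.conf=/etc/krb5.conf
        System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
        // Equivalent to -Djava.security.auth.login.config=/path/to/jaas.conf
        System.setProperty("java.security.auth.login.config", "/path/to/jaas.conf");
        // ... start Spark / touch HDFS only after these are in place ...
    }
}
```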
> Allow for keytab-based HDFS security in Standalone mode
> -------------------------------------------------------
>
> Key: SPARK-5158
> URL: https://issues.apache.org/jira/browse/SPARK-5158
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Patrick Wendell
> Assignee: Matthew Cheah
> Priority: Critical
>
> There have been a handful of patches for allowing access to Kerberized HDFS
> clusters in standalone mode. The main reason we haven't accepted these
> patches has been that they rely on insecure distribution of token files from
> the driver to the other components.
> As a simpler solution, I wonder if we should just provide a way to have the
> Spark driver and executors independently log in and acquire credentials using
> a keytab. This would work for users who have dedicated, single-tenant
> Spark clusters (i.e. they are willing to have a keytab on every machine
> running Spark for their application). It wouldn't address all possible
> deployment scenarios, but if it's simple I think it's worth considering.
> This would also work for Spark streaming jobs, which often run on dedicated
> hardware since they are long-running services.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)