[ https://issues.apache.org/jira/browse/SPARK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516856#comment-16516856 ]
Jelmer Kuperus edited comment on SPARK-5158 at 6/19/18 9:33 AM:
----------------------------------------------------------------
I ended up with the following workaround, which at first glance seems to work:

1. Create a _.java.login.config_ file in the home directory of the Spark user with the following contents:
{noformat}
com.sun.security.jgss.krb5.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  useTicketCache="true"
  ticketCache="/tmp/krb5cc_0"
  keyTab="/path/to/my.keytab"
  principal="u...@foo.com";
};
{noformat}
2. Put a krb5.conf file at /etc/krb5.conf.
3. Place your Hadoop configuration in /etc/hadoop/conf and, in `core-site.xml`, set:
* fs.defaultFS to webhdfs://your_hostname:14000/webhdfs/v1
* hadoop.security.authentication to kerberos
* hadoop.security.authorization to true
4. Make sure the Hadoop configuration is on Spark's classpath, e.g. the process should have something like this in it:
{noformat}
-cp /etc/spark/:/usr/share/spark/jars/*:/etc/hadoop/conf/
{noformat}
This configures a single principal for the entire Spark process. If you want to change the default paths to the configuration files, you can use:
{noformat}
-Djava.security.krb5.conf=/etc/krb5.conf
-Djava.security.auth.login.config=/path/to/jaas.conf
{noformat}
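For reference, the three `core-site.xml` settings from step 3 would look like this as an XML fragment; the hostname and port are the same placeholders used in the comment:
{noformat}
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>webhdfs://your_hostname:14000/webhdfs/v1</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
{noformat}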
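The two `-D` flags can also be set programmatically. A minimal sketch (the class name is hypothetical and the paths are the same placeholders as above; note the properties must be set before any JGSS/Kerberos classes are first loaded, or they may be ignored):

```java
// Hypothetical example: setting the Kerberos and JAAS config paths in code,
// equivalent to passing -Djava.security.krb5.conf and
// -Djava.security.auth.login.config on the command line.
public class KerberosProps {
    public static void main(String[] args) {
        // Equivalent to -Djava.security.krb5.conf=/etc/krb5.conf
        System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
        // Equivalent to -Djava.security.auth.login.config=/path/to/jaas.conf
        System.setProperty("java.security.auth.login.config", "/path/to/jaas.conf");
        // ... start Spark / touch HDFS only after these are in place ...
    }
}
```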
> Allow for keytab-based HDFS security in Standalone mode
> -------------------------------------------------------
>
> Key: SPARK-5158
> URL: https://issues.apache.org/jira/browse/SPARK-5158
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Patrick Wendell
> Assignee: Matthew Cheah
> Priority: Critical
>
> There have been a handful of patches for allowing access to Kerberized HDFS
> clusters in standalone mode. The main reason we haven't accepted these
> patches has been that they rely on insecure distribution of token files from
> the driver to the other components.
> As a simpler solution, I wonder if we should just provide a way to have the
> Spark driver and executors independently log in and acquire credentials using
> a keytab. This would work for users who have dedicated, single-tenant
> Spark clusters (i.e. they are willing to have a keytab on every machine
> running Spark for their application). It wouldn't address all possible
> deployment scenarios, but if it's simple I think it's worth considering.
> This would also work for Spark streaming jobs, which often run on dedicated
> hardware since they are long-running services.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)