What I can say is that we successfully use Spark on YARN with a Kerberized
cluster. One of my coworkers also tried using it the same way you are
(Spark standalone with a Kerberized cluster), but as far as I remember he
didn't succeed. I may be wrong, because I was not personally involved in
that use case, but I think he concluded that every executor in a Spark
standalone cluster must also be kinit'd.
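For reference, this is roughly the shape of submission that works for us on YARN; the keytab path, principal, class, and jar below are placeholders, not your actual values:

```shell
# Submit in cluster mode on YARN; with --keytab and --principal,
# Spark logs in from the keytab and can renew delegation tokens
# for the lifetime of the application.
./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --keytab /etc/security/keytabs/spark.keytab \
  --principal spark/hadoop-master@EXAMPLE.COM \
  --class com.example.App \
  app.jar hdfs://hadoop-master:8020/input
```

Note that the spark.yarn.* token settings only take effect with --master yarn; with a standalone master they are ignored, which matches what my coworker ran into.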

On Fri, Aug 5, 2016 at 15:54 Aneela Saleem <ane...@platalytics.com>
wrote:

> Hi all,
>
> I'm trying to connect to a Kerberized Hadoop cluster using a Spark job. I
> have kinit'd from the command line. When I run the following job, i.e.,
>
> *./bin/spark-submit --keytab /etc/hadoop/conf/spark.keytab --principal
> spark/hadoop-master@platalyticsrealm --class
> com.platalytics.example.spark.App --master spark://hadoop-master:7077
> /home/vm6/project-1-jar-with-dependencies.jar
> hdfs://hadoop-master:8020/text*
>
> I get the error:
>
> Caused by: java.io.IOException:
> org.apache.hadoop.security.AccessControlException: Client cannot
> authenticate via:[TOKEN, KERBEROS]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>
> Following are the contents of *spark-defaults.conf* file:
>
> spark.master                     spark://hadoop-master:7077
> spark.eventLog.enabled           true
> spark.eventLog.dir               hdfs://hadoop-master:8020/spark/logs
> spark.serializer                 org.apache.spark.serializer.KryoSerializer
> spark.yarn.access.namenodes hdfs://hadoop-master:8020/
> spark.yarn.security.tokens.hbase.enabled true
> spark.yarn.security.tokens.hive.enabled true
> spark.yarn.principal yarn/hadoop-master@platalyticsrealm
> spark.yarn.keytab /etc/hadoop/conf/yarn.keytab
>
>
> Also i have added following in *spark-env.sh* file:
>
> HOSTNAME=`hostname -f`
> export SPARK_HISTORY_OPTS="-Dspark.history.kerberos.enabled=true
> -Dspark.history.kerberos.principal=spark/${HOSTNAME}@platalyticsrealm
> -Dspark.history.kerberos.keytab=/etc/hadoop/conf/spark.keytab"
>
>
> Please guide me on how to trace this issue.
>
> Thanks
>
>
