Question about standalone Spark cluster reading from Kerberosed hadoop

Mu Kong Fri, 23 Jun 2017 02:10:59 -0700

Hi, all!

I am trying to read from a Kerberized Hadoop cluster from a standalone
Spark cluster.
Right now, I am running into an authentication issue with Kerberos:



java.io.IOException: Failed on local exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"XXXXXXXXXXXX"; destination host is: XXXXXXXXXXXXXXX;



I checked with klist, and the principal/realm is correct.
I also used the hdfs command line to access HDFS from all the nodes, and it
worked.
And if I submit the job in local (client) mode, it works fine.

I tried copying everything from hadoop/conf and hive/conf into
spark/conf.
I also tried editing spark/conf/spark-env.sh to add
SPARK_SUBMIT_OPTS/SPARK_MASTER_OPTS/SPARK_SLAVE_OPTS/HADOOP_CONF_DIR/HIVE_CONF_DIR,
and tried exporting them in .bashrc as well.
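
A minimal sketch of the kind of spark-env.sh entries described above. The
directory paths and the krb5.conf location are placeholders assumed for
illustration, not the actual values from the cluster:

```shell
# spark/conf/spark-env.sh (illustrative sketch; all paths are placeholders)

# Directories holding core-site.xml/hdfs-site.xml and hive-site.xml
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_CONF_DIR=/etc/hive/conf

# Assumed JVM option pointing at the Kerberos client configuration
KRB5_OPT="-Djava.security.krb5.conf=/etc/krb5.conf"
export SPARK_SUBMIT_OPTS="$KRB5_OPT"
export SPARK_MASTER_OPTS="$KRB5_OPT"
```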

However, I'm still experiencing the same exception.

Then I read some concerning posts about problems with
Kerberized Hadoop, such as the following one:
http://blog.stratio.com/spark-kerberos-safe-story/
which indicates that we cannot access Kerberized HDFS from a
standalone Spark cluster.

I'm using Spark 2.1.1. Is it still the case that we can't access Kerberized
HDFS with 2.1.1?

Thanks!


Best regards,
Mu
