Hi everyone,

I'm facing a weird situation with HiveContext and Kerberos in yarn-client mode.

Current configuration: HDP 2.2 (Hive 0.14, HDFS 2.6, YARN 2.6), Spark 1.5.2, HA NameNode activated, Kerberos enabled.
Situation: within the same Spark context, I receive "random" Kerberos errors. I say random because when I retry in the same context (session), I do get my result. I checked the clocks between the servers and my NTP source; there is no time gap.

[error on]
Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, [datanode fqdn]): java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "[datanode fqdn]/[datanode ip address]"; destination host is: "[namenode fqdn]":8020;
[error off]

[code on]
val res = sqlContext.sql("SELECT * FROM foo.song")
val counts = res
  // replaceAll (not replace) so the pattern is applied as a regex
  .map(row => row.getString(0).replaceAll("""([\p{Punct}&&[^.@]]|\b\p{IsLetter}{1,2}\b)\s*""", ""))
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.collect()
[code off]

I also tried this on several tables and databases (around 40 different tables with different structures and sizes) stored on my HiveServer, and I always get the same error, randomly (when I retry the same query a moment later, it works). I also tried a naive test: since I'm on an HA NameNode, I activated nn2 in place of nn1, and then my query did run, BUT it returned no results at all!

+--------------------+
|            sentence|
+--------------------+
+--------------------+

+--------------------+
|                   0|
+--------------------+

I don't believe my hive-site.xml is incorrect because, as I said, I can access the data some of the time.

What does work:
- Hive shell and beeline (connections and queries)
- spark-shell sc.textFile operations

Log inspection:
- nothing in the YARN logs
- no errors in hiveserver2.log

My questions: how does HiveContext work with the Kerberos ticket? And how can I solve my problem, or at least get more detail to debug it?

Thanks in advance for your answers.
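As a side check on the empty-result symptom: the cleaning regex can be exercised locally, without Spark or Kerberos. A minimal sketch in plain Scala (sample sentences are made up, and local collections stand in for the RDD pipeline); note that String.replace treats its argument as a literal string, so replaceAll is the call that actually applies the pattern as a regex:

```scala
// Local sketch of the cleaning + word-count step, no Spark needed.
object CleanDemo {
  // Same pattern as in the query: strip punctuation (except . and @)
  // and standalone 1-2 letter words, each with any trailing whitespace.
  val pattern = """([\p{Punct}&&[^.@]]|\b\p{IsLetter}{1,2}\b)\s*"""

  def clean(sentence: String): String = sentence.replaceAll(pattern, "")

  // groupBy + size is a local stand-in for map(word => (word, 1)).reduceByKey(_ + _)
  def wordCount(sentences: Seq[String]): Map[String, Int] =
    sentences
      .map(clean)
      .flatMap(_.split(" "))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (w, ws) => (w, ws.size) }

  def main(args: Array[String]): Unit = {
    println(clean("so this is a kerberos test."))   // -> "this kerberos test."
    println(wordCount(Seq("spark spark hive", "hive on yarn")))
  }
}
```

If this behaves as expected locally, the empty result after the nn2 failover points at the data access, not at the transformation.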
Best,

Configuration detail:

Spark config: symbolic link of hive-site.xml into my Spark conf dir:

ln -s /etc/hive/conf/hive-site.xml /home/myuser/spark/conf/

sh-4.1$ more java-opts
-Dhdp.version=2.2.4.2-2

sh-4.1$ more spark-env.sh
export HADOOP_HOME=/etc/hadoop/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf

sh-4.1$ more spark-default.sh
spark.driver.extraJavaOptions -Dhdp.version=2.2.4.4-16 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.4.4-16
spark.yarn.dist.files /usr/hdp/2.2.4.4-16/spark/conf/metrics.properties

Spark launch method in yarn-client mode:

kinit
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export YARN_CONF_DIR=/etc/hadoop/conf
export SPARK_HOME=/home/myuser/spark
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/usr/hdp/2.2.0.0-2041/hadoop/lib/native
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/hdp/2.2.0.0-2041/hadoop/lib/native
export SPARK_YARN_USER_ENV="JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH,LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
/home/myuser/spark/bin/spark-shell --master yarn-client --driver-memory 8g --executor-memory 8g \
  --num-executors 4 --executor-cores 5 \
  --conf "spark.storage.memoryFraction=0.2" \
  --conf "spark.rdd.compress=true" \
  --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
  --conf "spark.kryoserializer.buffer.max=512"

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/sqlContext-Client-cannot-authenticate-via-TOKEN-KERBEROS-tp25848.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
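Since the launch sequence above starts with a bare kinit, one low-level check when the intermittent errors appear is whether the ticket in the cache is still valid. A minimal sketch, assuming the MIT Kerberos client tools (klist/kinit) are installed; the principal name is a placeholder, not taken from the post:

```shell
# Sanity-check the Kerberos ticket cache before/while running spark-shell.
# 'klist -s' exits 0 only if the cache holds a valid (non-expired) ticket.
if klist -s 2>/dev/null; then
  echo "valid ticket found"
  klist 2>/dev/null | head -n 6   # shows principal, expiry and renew-until times
else
  # Re-obtain a TGT; the principal below is a placeholder.
  echo "no valid ticket - run: kinit myuser@EXAMPLE.REALM"
fi
```

Comparing the ticket's expiry/renew-until times against the moments the failures occur may show whether the errors line up with ticket lifetime.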