I am trying to set up my IDE for a Scala Spark application. I want to access HDFS files on a remote Hadoop server that has Kerberos enabled. My understanding is that I should be able to do this from Spark. Here is my code so far:
    import java.io.IOException
    import org.apache.commons.lang.StringUtils
    import org.apache.hadoop.security.UserGroupInformation
    import org.apache.spark.{SparkConf, SparkContext}

    val sparkConf = new SparkConf().setAppName(appName).setMaster(master)
    if (jars.length > 0) {
      sparkConf.setJars(jars)
    }
    if (!properties.isEmpty) {
      for ((k, v) <- properties) sparkConf.set(k, v)
    } else {
      sparkConf
        .set("spark.executor.memory", "1024m")
        .set("spark.cores.max", "1")
        .set("spark.default.parallelism", "4")
    }

    try {
      if (!StringUtils.isBlank(principal) && !StringUtils.isBlank(keytab)) {
        // UserGroupInformation.setConfiguration(config)
        UserGroupInformation.loginUserFromKeytab(principal, keytab)
      }
    } catch {
      case ioe: IOException =>
        println("Failed to login to Hadoop [principal = " + principal + ", keytab = " + keytab + "]")
        ioe.printStackTrace()
    }

    val sc = new SparkContext(sparkConf)
    val MY_FILE: String = "hdfs://remoteserver:port/file.out"
    val rDD = sc.textFile(MY_FILE, 10)
    println("Lines " + rDD.count)

I have core-site.xml on my classpath. I changed hadoop.ssl.enabled to false because it was expecting a secret key. The principal I am using is correct; I tried both username/_HOST@fully.qualified.domain and username@fully.qualified.domain with no success. I also tried running Spark in both local mode and YARN client mode. I am hoping someone has a recipe for this or has solved this problem; any pointers that help set up or debug it would be appreciated.

I am getting the following error message:

    Caused by: java.lang.IllegalArgumentException: Can't get Kerberos realm
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:65)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:227)
        at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:249)
        at org.apache.spark.examples.SparkYarn$.launchClient(SparkYarn.scala:55)
        at org.apache.spark.examples.SparkYarn$.main(SparkYarn.scala:83)
        at org.apache.spark.examples.SparkYarn.main(SparkYarn.scala)
        ... 6 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:75)
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
        ... 11 more
    Caused by: KrbException: Cannot locate default realm
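For what it's worth, the final "KrbException: Cannot locate default realm" usually means the JVM itself cannot find any Kerberos configuration, independent of Spark or Hadoop. A likely fix is to point the JVM at a krb5.conf (or set the realm and KDC as system properties) and to call UserGroupInformation.setConfiguration with a Kerberos-enabled Hadoop Configuration before the keytab login, which the code above has commented out. This is only a sketch of that idea; the path, realm, and KDC host below are placeholders, not values from the post:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.UserGroupInformation

    // Tell the JVM where the Kerberos configuration lives (placeholder path).
    System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")
    // Alternatively, set the realm and KDC directly (placeholder values):
    // System.setProperty("java.security.krb5.realm", "FULLY.QUALIFIED.DOMAIN")
    // System.setProperty("java.security.krb5.kdc", "kdc.fully.qualified.domain")

    // Build a Hadoop Configuration with Kerberos authentication enabled.
    val hadoopConf = new Configuration()
    hadoopConf.set("hadoop.security.authentication", "kerberos")

    // This must run before loginUserFromKeytab; otherwise UGI initializes
    // with the default "simple" security settings and the login fails.
    UserGroupInformation.setConfiguration(hadoopConf)
    UserGroupInformation.loginUserFromKeytab(principal, keytab)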