I am trying to set up my IDE for a Scala Spark application.  I want to
access HDFS files on a remote Hadoop server that has Kerberos enabled.  My
understanding is that I should be able to do this from Spark.  Here is my
code so far:

    import java.io.IOException
    import org.apache.commons.lang3.StringUtils
    import org.apache.hadoop.security.UserGroupInformation
    import org.apache.spark.{SparkConf, SparkContext}

    val sparkConf = new SparkConf().setAppName(appName).setMaster(master)

    if (jars.length > 0) {
      sparkConf.setJars(jars)
    }

    if (properties.nonEmpty) {
      for ((k, v) <- properties)
        sparkConf.set(k, v)
    } else {
      sparkConf
        .set("spark.executor.memory", "1024m")
        .set("spark.cores.max", "1")
        .set("spark.default.parallelism", "4")
    }

    try {
      if (!StringUtils.isBlank(principal) && !StringUtils.isBlank(keytab)) {
        // UserGroupInformation.setConfiguration(config)
        UserGroupInformation.loginUserFromKeytab(principal, keytab)
      }
    } catch {
      case ioe: IOException =>
        println("Failed to login to Hadoop [principal = " + principal +
          ", keytab = " + keytab + "]")
        ioe.printStackTrace()
    }

    val sc = new SparkContext(sparkConf)
    val MY_FILE: String = "hdfs://remoteserver:port/file.out"
    val rDD = sc.textFile(MY_FILE, 10)
    println("Lines " + rDD.count)

I have core-site.xml on my classpath.  I changed hadoop.ssl.enabled to false
because it was expecting a secret key.  The principal I am using is correct:
I tried both username/_HOST@fully.qualified.domain and
username@fully.qualified.domain with no success.  I tried running Spark in
both local mode and yarn-client mode.  I am hoping someone has a recipe or
has already solved this problem.  Any pointers to help set up or debug this
would be appreciated.
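
Since the trace below ends in "Cannot locate default realm", I am also
wondering whether I need to point the JVM at my Kerberos configuration
explicitly before any Hadoop classes load, along these lines (the path,
realm, and KDC host are placeholders, not my real values):

    // Option 1: point the JVM at an existing krb5.conf.
    System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")

    // Option 2: supply realm and KDC directly, without a krb5.conf file.
    // My understanding is that both properties must be set together.
    // System.setProperty("java.security.krb5.realm", "FULLY.QUALIFIED.DOMAIN")
    // System.setProperty("java.security.krb5.kdc", "kdc.fully.qualified.domain")

Is one of these the recommended approach, or should the krb5.conf location
be picked up some other way?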

I am getting the following error message:

Caused by: java.lang.IllegalArgumentException: Can't get Kerberos realm
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:65)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:227)
        at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:249)
        at org.apache.spark.examples.SparkYarn$.launchClient(SparkYarn.scala:55)
        at org.apache.spark.examples.SparkYarn$.main(SparkYarn.scala:83)
        at org.apache.spark.examples.SparkYarn.main(SparkYarn.scala)
        ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:75)
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
        ... 11 more
Caused by: KrbException: Cannot locate default realm


