Can you share the exception(s) you encountered?


> Hi,
> My modified code is listed below; it only adds the SecurityUtil API call. I
> don't know which property keys I should use, so I defined two property keys
> of my own to locate the keytab and principal.
> object TestHBaseRead2 {
>  def main(args: Array[String]) {
>    val conf = new SparkConf()
>    val sc = new SparkContext(conf)
>    val hbConf = HBaseConfiguration.create()
>    hbConf.set("dhao.keytab.file","//etc//spark//keytab//spark.user.keytab")
>    hbConf.set("dhao.user.principal","")
>    SecurityUtil.login(hbConf,"dhao.keytab.file","dhao.user.principal")
>    val conn = ConnectionFactory.createConnection(hbConf)
>    val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>    try {
>      val get = new Get(Bytes.toBytes("row01"))
>      val res = tbl.get(get)
>      println("result:"+res.toString)
>    }
>    finally {
>      tbl.close()
>      conn.close()
>    }
>    val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>    val v = rdd.sum()
>    println("Value="+v)
>    sc.stop()
>  }
> }
> Can you post the modified code?
> Thanks
>> On May 21, 2015, at 11:11 PM, donhoff_h <> wrote:
>> Hi,
>> Thanks very much for the reply.  I have tried "SecurityUtil". I can see
>> from the log that this statement executed successfully, but I still cannot
>> pass HBase authentication. With more experiments, I found a new
>> interesting scenario. If I run the program in yarn-client mode, the driver
>> can pass authentication, but the executors cannot. If I run the program
>> in yarn-cluster mode, neither the driver nor the executors can pass
>> authentication.  Can anybody give me some clue with this info? Many thanks!
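A minimal sketch of one thing to try for the executor-side failure described above. It assumes the keytab is readable at the same path on every node that can host an executor, and that hbase-site.xml and core-site.xml are on the executor classpath. A login performed in main() only affects the driver JVM, so this sketch performs the keytab login inside the closure that runs on the executor. The object name, principal, and keytab path are hypothetical placeholders, and the code has not been run against this cluster.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.{SparkConf, SparkContext}

object ExecutorSideLoginSketch {
  // Hypothetical values: replace with a real principal and a keytab path
  // that exists on every node that can host an executor.
  val principal = "user@EXAMPLE.COM"
  val keytabPath = "/path/to/user.keytab"

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ExecutorSideLoginSketch"))
    val rowKeys = sc.parallelize(Seq("row01"))

    rowKeys.foreachPartition { keys =>
      // This closure runs inside the executor JVM, so the login happens there.
      // The HBase configuration is built here because it is not serializable.
      val hbConf = HBaseConfiguration.create()
      UserGroupInformation.setConfiguration(hbConf)
      val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytabPath)

      // Run the HBase calls as the freshly logged-in user.
      ugi.doAs(new PrivilegedExceptionAction[Unit] {
        override def run(): Unit = {
          val conn = ConnectionFactory.createConnection(hbConf)
          try {
            val tbl = conn.getTable(TableName.valueOf("spark_t01"))
            try keys.foreach(k => println("result:" + tbl.get(new Get(Bytes.toBytes(k)))))
            finally tbl.close()
          } finally conn.close()
        }
      })
    }
    sc.stop()
  }
}
```

If this works where the driver-only login did not, that would suggest the executors simply had no Kerberos credentials of their own. It does not cover the newAPIHadoopRDD path, where the connection is opened by TableInputFormat rather than by user code.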
>> Are the worker nodes colocated with HBase region servers?
>> Were you running as the hbase superuser?
>> You may need to login, using code similar to the following:
>>       if (isSecurityEnabled()) {
>>         SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>       }
>> SecurityUtil is a Hadoop class.
>> Cheers
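A rough Scala rendering of the login call suggested above. SecurityUtil.login reads the keytab path and the principal from the two configuration keys passed to it, and substitutes the given hostname for a _HOST placeholder in the principal. The object name, the method name, and both property key names are hypothetical, chosen only for illustration.

```scala
import java.net.InetAddress

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.{SecurityUtil, UserGroupInformation}

object KeytabLoginSketch {
  // Hypothetical property keys: any names work, as long as the same keys
  // are set in the Configuration before login is called.
  val KeytabKey = "example.keytab.file"
  val PrincipalKey = "example.user.principal"

  /** Logs in from the keytab when Kerberos is enabled; returns whether a login was attempted. */
  def loginIfSecure(conf: Configuration): Boolean = {
    // Make UserGroupInformation use this configuration's authentication settings.
    UserGroupInformation.setConfiguration(conf)
    if (UserGroupInformation.isSecurityEnabled) {
      val host = InetAddress.getLocalHost.getCanonicalHostName
      SecurityUtil.login(conf, KeytabKey, PrincipalKey, host)
      true
    } else {
      false
    }
  }
}
```

A caller would set both keys on the configuration (for example with hbConf.set) and then call KeytabLoginSketch.loginIfSecure(hbConf) before opening any HBase connection.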
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h <> wrote:
>>> Hi,
>>> Many thanks for the help. My Spark version is 1.3.0 too, and I run it on
>>> YARN. Following your advice, I have changed the configuration. Now my
>>> program can read hbase-site.xml correctly, and it can also authenticate
>>> with ZooKeeper successfully.
>>> But I have hit a new problem: my program still cannot pass HBase
>>> authentication. Have you or anybody else ever run into this kind of
>>> situation?  I used a keytab file to provide the principal. Since it can
>>> pass ZooKeeper authentication, I am sure the keytab file is OK.
>>> But it just cannot pass HBase authentication. The exception is
>>> listed below; could you or anybody else help me? Many thanks again!
>>> ****************************Exception***************************
>>> 15/05/21 16:03:18 INFO zookeeper.ZooKeeper: Initiating client connection, 
>>> sessionTimeout=90000 watcher=hconnection-0x4e142a710x0, 
>>> baseZNode=/hbase
>>> 15/05/21 16:03:18 INFO zookeeper.Login: successfully logged in.
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh thread started.
>>> 15/05/21 16:03:18 INFO client.ZooKeeperSaslClient: Client will use GSSAPI 
>>> as SASL mechanism.
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Opening socket connection to 
>>> server Will attempt to SASL-authenticate 
>>> using Login Context section 'Client'
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Socket connection established 
>>> to, initiating session
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT valid starting at:        Thu 
>>> May 21 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT expires:                  Fri 
>>> May 22 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh sleeping until: Fri May 
>>> 22 11:43:32 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Session establishment complete 
>>> on server, sessionid = 0x24d46cb0ffd0020, 
>>> negotiated timeout = 40000
>>> 15/05/21 16:03:18 WARN mapreduce.TableInputFormatBase: initializeTable 
>>> called multiple times. Overwriting connection and table reference; 
>>> TableInputFormatBase will not close these old references when done.
>>> 15/05/21 16:03:19 INFO util.RegionSizeCalculator: Calculating region sizes 
>>> for table "ns_dev1:hd01".
>>> 15/05/21 16:03:19 WARN ipc.AbstractRpcClient: Exception encountered while 
>>> connecting to the server : GSS initiate 
>>> failed [Caused by GSSException: No valid credentials provided (Mechanism 
>>> level: Failed to find any Kerberos tgt)]
>>> 15/05/21 16:03:19 ERROR ipc.AbstractRpcClient: SASL authentication failed. 
>>> The most likely cause is missing or invalid credentials. Consider 'kinit'.
>>> GSS initiate failed [Caused by 
>>> GSSException: No valid credentials provided (Mechanism level: Failed to 
>>> find any Kerberos tgt)]
>>>                 at 
>>>                 at 
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$
>>>                 at 
>>> Method)
>>>                 at
>>>                 at 
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(
>>>                 at 
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(
>>>                 at 
>>> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(
>>>                 at 
>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(
>>>                 at 
>>> org.apache.hadoop.hbase.client.ScannerCallable.openScanner(
>>>                 at 
>>>                 at 
>>>                 at 
>>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
>>>                 at 
>>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$
>>>                 at 
>>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$
>>>                 at
>>>                 at 
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(
>>>                 at 
>>> java.util.concurrent.ThreadPoolExecutor$
>>>                 at
>>> ***********************I also list my code below in case someone can give
>>> me some advice on it*************************
>>> object TestHBaseRead {
>>>  def main(args: Array[String]) {
>>>    val conf = new SparkConf()
>>>    val sc = new SparkContext(conf)
>>>    val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>>    val tbName = if(args.length==1) args(0) else "ns_dev1:hd01"
>>>    hbConf.set(TableInputFormat.INPUT_TABLE,tbName)
>>>    // Print the contents of hbConf to check that it read the correct hbase-site.xml
>>>    val it = hbConf.iterator()
>>>    while(it.hasNext) {
>>>      val e = it.next()
>>>      println("Key="+ e.getKey +" Value="+e.getValue)
>>>    }
>>>    val rdd = 
>>> sc.newAPIHadoopRDD(hbConf,classOf[TableInputFormat],classOf[ImmutableBytesWritable],classOf[Result])
>>>    rdd.foreach(x=>{
>>>      val key = x._1.toString
>>>      val it = x._2.listCells().iterator()
>>>      while(it.hasNext) {
>>>        val c = it.next()
>>>        val family = Bytes.toString(CellUtil.cloneFamily(c))
>>>        val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>>        val value = Bytes.toString(CellUtil.cloneValue(c))
>>>        val tm = c.getTimestamp
>>>        println("Key="+key+" Family="+family+" Qualifier="+qualifier+
>>>          " Value="+value+" TimeStamp="+tm)
>>>      }
>>>    })
>>>    sc.stop()
>>>  }
>>> }
>>> ***************************I used the following command to run my 
>>> program**********************
>>> spark-submit --class --master 
>>> yarn-cluster --driver-java-options 
>>> " 
>>>" --conf 
>>> spark.executor.extraJavaOptions="
>>>" /home/spark/myApps/TestHBase.jar
>>> I have a similar problem: I can no longer pass the HBase configuration
>>> directory to Spark as extra classpath using
>>> spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used
>>> to run this in 1.2 without any problem.
>>>> On Tuesday, May 19, 2015, donhoff_h <> wrote:
>>>> Sorry, this reference does not help me.  I have set up the configuration
>>>> in hbase-site.xml, but it seems some extra configuration still has to be
>>>> set, or some API called, before my Spark program can authenticate with
>>>> HBase.
>>>> Does anybody know how to set up authentication against a secured HBase
>>>> in a Spark program that uses the "newAPIHadoopRDD" API to read from HBase?
>>>> Many thanks!
>>>> Please take a look at:
>>>> Cheers
>>>>> On Tue, May 19, 2015 at 5:23 AM, donhoff_h <> wrote:
>>>>> The principal is It is the user that I used to run my 
>>>>> spark programs. I am sure I have run the kinit command to make it take 
>>>>> effect. And I also used the HBase Shell to verify that this user has the 
>>>>> right to scan and put the tables in HBase.
>>>>> Now I still have no idea how to solve this problem. Can anybody help me 
>>>>> to figure it out? Many Thanks!
>>>>> Which user did you run your program as?
>>>>> Have you granted the proper permissions on the HBase side?
>>>>> You should also check the master log to see if there is some clue.
>>>>> Cheers
>>>>>> On May 19, 2015, at 2:41 AM, donhoff_h <> wrote:
>>>>>> Hi, experts.
>>>>>> I ran the "HBaseTest" program, an example from the Apache Spark
>>>>>> source code, to learn how to use Spark to access HBase, but I hit the
>>>>>> following exception:
>>>>>> Exception in thread "main" 
>>>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
>>>>>> attempts=36, exceptions:
>>>>>> Tue May 19 16:59:11 CST 2015, null, 
>>>>>> callTimeout=60000, callDuration=68648: row 'spark_t01,,00000000000000' 
>>>>>> on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
>>>>>>,16020,1431412877700, seqNum=0
>>>>>> I also checked the RegionServer Log of the host "" listed 
>>>>>> in the above exception. I found a few entries like the following one:
>>>>>> 2015-05-19 16:59:11,143 DEBUG 
>>>>>> [RpcServer.reader=2,,port=16020] 
>>>>>> ipc.RpcServer: RpcServer.listener,port=16020: Caught exception while 
>>>>>> reading:Authentication is required 
>>>>>> The above entry did not clearly point to my program, but the timestamps
>>>>>> are very close. Since my HBase version is 1.0.0 and I have security
>>>>>> enabled, I suspect the exception was caused by Kerberos
>>>>>> authentication, but I am not sure.
>>>>>> Does anybody know if my guess is right? And if so, could anybody
>>>>>> tell me how to set up Kerberos authentication in a Spark program? I
>>>>>> don't know how to do it. I already checked the API docs but did not
>>>>>> find any useful API. Many thanks!
>>>>>> By the way, my Spark version is 1.3.0. I have also pasted the code of
>>>>>> "HBaseTest" below:
>>>>>> ***************************Source Code******************************
>>>>>> object HBaseTest {
>>>>>>   def main(args: Array[String]) {
>>>>>>     val sparkConf = new SparkConf().setAppName("HBaseTest")
>>>>>>     val sc = new SparkContext(sparkConf)
>>>>>>     val conf = HBaseConfiguration.create()
>>>>>>     conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>>>>>     // Initialize hBase table if necessary
>>>>>>     val admin = new HBaseAdmin(conf)
>>>>>>     if (!admin.isTableAvailable(args(0))) {
>>>>>>       val tableDesc = new HTableDescriptor(args(0))
>>>>>>       admin.createTable(tableDesc)
>>>>>>     }
>>>>>>     val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>>>>>       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>>>>>       classOf[org.apache.hadoop.hbase.client.Result])
>>>>>>     hBaseRDD.count()
>>>>>>     sc.stop()
>>>>>>   }
>>>>>> }
