Hi,

My modified code is listed below; the only change is the call to the SecurityUtil API. I don't 
know which property keys I should use, so I made up two of my own property keys to point to 
the keytab and principal.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.security.SecurityUtil
import org.apache.spark.{SparkConf, SparkContext}

object TestHBaseRead2 {
  def main(args: Array[String]) {

    val conf = new SparkConf()
    val sc = new SparkContext(conf)
    val hbConf = HBaseConfiguration.create()
    // Store the keytab path and principal under my own (arbitrary) property keys,
    // then let SecurityUtil.login read them back from those keys.
    hbConf.set("dhao.keytab.file", "/etc/spark/keytab/spark.user.keytab")
    hbConf.set("dhao.user.principal", "sp...@bgdt.dev.hrb")
    SecurityUtil.login(hbConf, "dhao.keytab.file", "dhao.user.principal")
    val conn = ConnectionFactory.createConnection(hbConf)
    val tbl = conn.getTable(TableName.valueOf("spark_t01"))
    try {
      val get = new Get(Bytes.toBytes("row01"))
      val res = tbl.get(get)
      println("result:" + res.toString)
    }
    finally {
      tbl.close()
      conn.close()
    }

    val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
    val v = rdd.sum()
    println("Value=" + v)
    sc.stop()

  }
}
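
(Note on the code above: the SecurityUtil.login call executes only in the driver JVM; on YARN the executors never run it. That is consistent with the yarn-client behaviour described further down, where the driver authenticates but the executors do not.)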

------------------ Original Message ------------------
From: "yuzhihong" <yuzhih...@gmail.com>
Date: Friday, May 22, 2015, 3:25
To: "donhoff_h" <165612...@qq.com>
Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
Subject: Re: Re: How to use spark to access HBase with Security enabled

Can you post the modified code?


Thanks

On May 21, 2015, at 11:11 PM, donhoff_h <165612...@qq.com> wrote:


Hi,

Thanks very much for the reply. I have tried "SecurityUtil". I can see from the log that the 
statement executed successfully, but I still cannot pass HBase authentication. With more 
experiments I also found an interesting new scenario: in yarn-client mode the driver can pass 
authentication but the executors cannot, while in yarn-cluster mode neither the driver nor the 
executors can pass authentication. Can anybody give me a clue based on this? Many thanks!
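
A workaround sometimes suggested for exactly this symptom is to perform the keytab login inside each executor, since the driver-side login never propagates to the executor JVMs. Below is a minimal sketch, assuming the keytab file exists at the same local path on every worker node; the principal, keytab path, table and row keys are placeholders, not values confirmed in this thread:

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.{SparkConf, SparkContext}

object TestHBaseReadPerExecutorLogin {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf())
    val rowKeys = sc.parallelize(Seq("row01", "row02"))
    rowKeys.foreachPartition { iter =>
      // This closure runs in the executor JVM, so the login happens there.
      val conf = HBaseConfiguration.create()
      UserGroupInformation.setConfiguration(conf)
      val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
        "spark-user@YOUR.REALM",                  // placeholder principal
        "/etc/spark/keytab/spark.user.keytab")    // placeholder local path
      ugi.doAs(new PrivilegedExceptionAction[Unit] {
        override def run(): Unit = {
          val conn = ConnectionFactory.createConnection(conf)
          try {
            val tbl = conn.getTable(TableName.valueOf("spark_t01"))
            try {
              iter.foreach(k => println(tbl.get(new Get(Bytes.toBytes(k)))))
            } finally tbl.close()
          } finally conn.close()
        }
      })
    }
    sc.stop()
  }
}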

------------------ Original Message ------------------
From: "yuzhihong" <yuzhih...@gmail.com>
Date: Friday, May 22, 2015, 5:29
To: "donhoff_h" <165612...@qq.com>
Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
Subject: Re: How to use spark to access HBase with Security enabled

Are the worker nodes colocated with the HBase region servers?

Were you running as the HBase superuser?


You may need to log in, using code similar to the following:

      if (isSecurityEnabled()) {
        SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
      }

SecurityUtil is a Hadoop class.
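
For reference, that method is org.apache.hadoop.security.SecurityUtil.login(Configuration conf, String keytabFileKey, String userNameKey): it looks up the keytab path and the principal under whatever configuration keys you pass it, so the key names themselves are arbitrary. A minimal sketch with made-up key names and placeholder values:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.SecurityUtil

val conf = new Configuration()
// The key names are arbitrary; login() simply reads these two entries.
conf.set("my.keytab.file", "/path/to/user.keytab")    // placeholder path
conf.set("my.kerberos.principal", "user@YOUR.REALM")  // placeholder principal
SecurityUtil.login(conf, "my.keytab.file", "my.kerberos.principal")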

Cheers

On Thu, May 21, 2015 at 1:58 AM, donhoff_h <165612...@qq.com> wrote:
Hi,

Many thanks for the help. My Spark version is 1.3.0 too, and I run it on YARN. Following your 
advice I have changed the configuration. Now my program reads hbase-site.xml correctly, and it 
also authenticates with ZooKeeper successfully.

But I have a new problem: my program still cannot pass HBase authentication. Has anybody ever 
met this kind of situation? I used a keytab file to provide the principal, and since it passes 
ZooKeeper authentication I am sure the keytab file is OK; it just cannot pass HBase 
authentication. The exception is listed below. Could anybody help me? Many thanks!

****************************Exception***************************
15/05/21 16:03:18 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181 sessionTimeout=90000 watcher=hconnection-0x4e142a710x0, quorum=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181, baseZNode=/hbase
15/05/21 16:03:18 INFO zookeeper.Login: successfully logged in.
15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh thread started.
15/05/21 16:03:18 INFO client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Opening socket connection to server bgdt02.dev.hrb/130.1.9.98:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Socket connection established to bgdt02.dev.hrb/130.1.9.98:2181, initiating session
15/05/21 16:03:18 INFO zookeeper.Login: TGT valid starting at:        Thu May 21 16:03:18 CST 2015
15/05/21 16:03:18 INFO zookeeper.Login: TGT expires:                  Fri May 22 16:03:18 CST 2015
15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh sleeping until: Fri May 22 11:43:32 CST 2015
15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Session establishment complete on server bgdt02.dev.hrb/130.1.9.98:2181, sessionid = 0x24d46cb0ffd0020, negotiated timeout = 40000
15/05/21 16:03:18 WARN mapreduce.TableInputFormatBase: initializeTable called multiple times. Overwriting connection and table reference; TableInputFormatBase will not close these old references when done.
15/05/21 16:03:19 INFO util.RegionSizeCalculator: Calculating region sizes for table "ns_dev1:hd01".
15/05/21 16:03:19 WARN ipc.AbstractRpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/05/21 16:03:19 ERROR ipc.AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:604)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:153)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:730)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:727)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:727)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:880)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:849)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1173)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31751)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:187)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
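
Given that the ZooKeeper SASL handshake succeeds while the HBase RPC fails with "Failed to find any Kerberos tgt", one thing worth double-checking is that the client configuration actually enables Kerberos for HBase itself and names the server principals. A sketch of the settings to verify, normally supplied via hbase-site.xml rather than set in code; the realm below is a placeholder:

import org.apache.hadoop.hbase.HBaseConfiguration

val hbConf = HBaseConfiguration.create()
// Normally picked up from hbase-site.xml; shown explicitly for illustration.
hbConf.set("hbase.security.authentication", "kerberos")
hbConf.set("hadoop.security.authentication", "kerberos")
// Server principals; _HOST is expanded to each server's own hostname.
hbConf.set("hbase.master.kerberos.principal", "hbase/_HOST@YOUR.REALM")
hbConf.set("hbase.regionserver.kerberos.principal", "hbase/_HOST@YOUR.REALM")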

***********************I also list my code below, in case someone can give me advice on it*************************
import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration}
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}

object TestHBaseRead {
  def main(args: Array[String]) {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)
    val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
    val tbName = if (args.length == 1) args(0) else "ns_dev1:hd01"
    hbConf.set(TableInputFormat.INPUT_TABLE, tbName)
    // Print the content of hbConf to check that it read the correct hbase-site.xml.
    val it = hbConf.iterator()
    while (it.hasNext) {
      val e = it.next()
      println("Key=" + e.getKey + " Value=" + e.getValue)
    }

    val rdd = sc.newAPIHadoopRDD(hbConf, classOf[TableInputFormat],
      classOf[ImmutableBytesWritable], classOf[Result])
    rdd.foreach(x => {
      // Decode the row key bytes instead of relying on ImmutableBytesWritable.toString.
      val key = Bytes.toString(x._1.get(), x._1.getOffset, x._1.getLength)
      val cells = x._2.listCells().iterator()
      while (cells.hasNext) {
        val c = cells.next()
        val family = Bytes.toString(CellUtil.cloneFamily(c))
        val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
        val value = Bytes.toString(CellUtil.cloneValue(c))
        val tm = c.getTimestamp
        println("Key=" + key + " Family=" + family + " Qualifier=" + qualifier +
          " Value=" + value + " TimeStamp=" + tm)
      }
    })
    sc.stop()
  }
}
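
(One caveat when running this on YARN: the println calls inside rdd.foreach execute on the executors, so their output goes to the executor stdout logs, not to the driver console.)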

***************************I used the following command to run my program**********************
spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
  --master yarn-cluster \
  --driver-java-options "-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  /home/spark/myApps/TestHBase.jar
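
A detail worth checking with this command, assuming the JAAS file (and any keytab it references) is given by absolute path: those files must exist at that path on every node, because each executor JVM resolves them locally. If they exist only on the client machine, Spark's --files option can ship them into each container's working directory, along the lines of this sketch (untested):

spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
  --master yarn-cluster \
  --files /home/spark/spark-hbase.jaas \
  --driver-java-options "-Djava.security.auth.login.config=spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  /home/spark/myApps/TestHBase.jar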



------------------ Original Message ------------------
From: "Bill Q" <bill.q....@gmail.com>
Date: Wednesday, May 20, 2015, 10:13
To: "donhoff_h" <165612...@qq.com>
Cc: "yuzhihong" <yuzhih...@gmail.com>; "user" <user@spark.apache.org>
Subject: Re: How to use spark to access HBase with Security enabled

I have a similar problem: I can no longer pass the HBase configuration directory to Spark on 
the classpath using spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used to 
run this way in 1.2 without any problem.

On Tuesday, May 19, 2015, donhoff_h <165612...@qq.com> wrote:

Sorry, this reference does not help me. I have set up the configuration in hbase-site.xml, but 
it seems there are still extra configurations to set, or APIs to call, before my Spark program 
can pass authentication with HBase.

Does anybody know how to authenticate against a secured HBase from a Spark program that uses 
the "newAPIHadoopRDD" API to read from HBase?

Many thanks!



------------------ Original Message ------------------
From: "yuzhihong" <yuzhih...@gmail.com>
Date: Tuesday, May 19, 2015, 9:54
To: "donhoff_h" <165612...@qq.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: How to use spark to access HBase with Security enabled

Please take a look at:
http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation

Cheers


On Tue, May 19, 2015 at 5:23 AM, donhoff_h <165612...@qq.com> wrote:


The principal is sp...@bgdt.dev.hrb. It is the user I use to run my Spark programs. I am sure 
I ran the kinit command to make it take effect, and I also used the HBase shell to verify that 
this user has the right to scan and put the tables in HBase.

I still have no idea how to solve this problem. Can anybody help me figure it out? Many thanks!


------------------ Original Message ------------------
From: "yuzhihong" <yuzhih...@gmail.com>
Date: Tuesday, May 19, 2015, 7:55
To: "donhoff_h" <165612...@qq.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: How to use spark to access HBase with Security enabled

Which user did you run your program as?

Have you granted the proper permissions on the HBase side?

You should also check the master log for clues.

Cheers

On May 19, 2015, at 2:41 AM, donhoff_h <165612...@qq.com> wrote:


Hi, experts.

I ran the "HBaseTest" program, an example from the Apache Spark source code, to learn how to 
use Spark to access HBase. But I met the following exception:
Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue May 19 16:59:11 CST 2015, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68648: row 'spark_t01,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=bgdt01.dev.hrb,16020,1431412877700, seqNum=0

I also checked the RegionServer log of the host "bgdt01.dev.hrb" listed in the above exception. 
I found a few entries like the following one:
2015-05-19 16:59:11,143 DEBUG [RpcServer.reader=2,bindAddress=bgdt01.dev.hrb,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: Caught exception while reading:Authentication is required

The above entry does not point to my program explicitly, but the times are very close. Since my 
HBase version is 1.0.0 and security is enabled, I suspect the exception was caused by Kerberos 
authentication, but I am not sure.

Does anybody know if my guess is right? And if so, could anybody tell me how to set up Kerberos 
authentication in a Spark program? I don't know how to do it; I checked the API docs but did 
not find any useful API. Many thanks!

By the way, my Spark version is 1.3.0. I also paste the code of "HBaseTest" below:
***************************Source Code******************************
object HBaseTest {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("HBaseTest")
    val sc = new SparkContext(sparkConf)
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, args(0))

    // Initialize hBase table if necessary
    val admin = new HBaseAdmin(conf)
    if (!admin.isTableAvailable(args(0))) {
      val tableDesc = new HTableDescriptor(args(0))
      admin.createTable(tableDesc)
    }

    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    hBaseRDD.count()

    sc.stop()
  }
}

-- 
Many thanks.

Bill