Re: How to use spark to access HBase with Security enabled

2015-05-22 Thread Frank Staszak
You might also enable debug in:
# Extra Java runtime options.  Empty by default.
and check that the principals are the same on the NameNode and DataNode.
You can confirm the same on all nodes in hdfs-site.xml.
You can also ensure all nodes in the cluster are kerberized in core-site.xml
(no auth by default):
Set the authentication for the cluster. Valid values are:
simple or kerberos.

Best Regards
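
A minimal sketch of the core-site.xml check described above. It assumes the setting in question is the standard Hadoop key "hadoop.security.authentication" (the message does not name it) and that the cluster's core-site.xml is on the classpath of the JVM running the check. The object name ShowAuthMode is made up for illustration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Hypothetical helper, not from the thread: prints the authentication mode
// that this JVM's Hadoop configuration resolves to.
object ShowAuthMode {
  def main(args: Array[String]): Unit = {
    // new Configuration() loads core-default.xml plus any core-site.xml on the classpath.
    val conf = new Configuration()
    // Assumed key name; expected values are "simple" or "kerberos".
    println("hadoop.security.authentication = " + conf.get("hadoop.security.authentication"))
    // Ask Hadoop's security layer how it interprets that configuration.
    UserGroupInformation.setConfiguration(conf)
    println("security enabled: " + UserGroupInformation.isSecurityEnabled)
  }
}
```

Running this on each node (or inside a Spark task) should show whether every JVM involved actually sees the kerberized setting.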

> On May 22, 2015, at 4:25 AM, Ted Yu wrote:
> Can you share the exception(s) you encountered?
> Thanks
> On May 22, 2015, at 12:33 AM, donhoff_h wrote:
>> Hi,
>> My modified code is listed below; I just added the SecurityUtil API. I don't
>> know which property keys I should use, so I made two property keys of my own
>> to find the keytab and principal.
>> object TestHBaseRead2 {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hbConf = HBaseConfiguration.create()
>>     val conn = ConnectionFactory.createConnection(hbConf)
>>     val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>>     try {
>>       val get = new Get(Bytes.toBytes("row01"))
>>       val res = tbl.get(get)
>>       println("result:" + res.toString)
>>     } finally {
>>       tbl.close()
>>       conn.close()
>>       es.shutdown()
>>     }
>>     val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>>     val v = rdd.sum()
>>   }
>> }
>> -- Original Message --
>> From: "yuzhihong"
>> Sent: Friday, May 22, 2015, 3:25 PM
>> To: "donhoff_h"
>> Cc: "Bill Q"; "user"
>> Subject: Re: Re: How to use spark to access HBase with Security enabled
>> Can you post the modified code?
>> Thanks
>> On May 21, 2015, at 11:11 PM, donhoff_h wrote:
>>> Hi,
>>> Thanks very much for the reply. I have tried "SecurityUtil". I can see
>>> from the log that this statement executed successfully, but I still cannot
>>> pass HBase authentication. With more experiments, I found a new and
>>> interesting scenario. If I run the program in yarn-client mode, the driver
>>> can pass authentication, but the executors cannot. If I run the program
>>> in yarn-cluster mode, neither the driver nor the executors can pass
>>> authentication. Can anybody give me a clue based on this info?
>>> Many Thanks!
>>> -- Original Message --
>>> From: "yuzhihong"
>>> Sent: Friday, May 22, 2015, 5:29 AM
>>> To: "donhoff_h"
>>> Cc: "Bill Q"; "user"
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>> Are the worker nodes colocated with HBase region servers?
>>> Were you running as the hbase superuser?
>>> You may need to log in, using code similar to the following:
>>>   if (isSecurityEnabled()) {
>>>     SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>>   }
>>> SecurityUtil is a Hadoop class.
>>> Cheers
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h wrote:
>>> Hi,
>>> Many thanks for the help. My Spark version is 1.3.0 too, and I run it on
>>> YARN. Following your advice, I have changed the configuration. Now my
>>> program can read hbase-site.xml correctly, and it can also authenticate
>>> with ZooKeeper successfully.
>>> But I meet

Re: How to use spark to access HBase with Security enabled

2015-05-21 Thread Bill Q
>> at
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(
>> at
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(
>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(
>> at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(
>> at
>> at
>> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$
>> at
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> at java.util.concurrent.ThreadPoolExecutor$
>> at
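>> *** I also list my code below, in case someone can give me some advice
>> from it ***
>> import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration}
>> import org.apache.hadoop.hbase.client.Result
>> import org.apache.hadoop.hbase.io.ImmutableBytesWritable
>> import org.apache.hadoop.hbase.mapreduce.TableInputFormat
>> import org.apache.hadoop.hbase.util.Bytes
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> object TestHBaseRead {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>     val tbName = if (args.length == 1) args(0) else "ns_dev1:hd01"
>>     // Tell TableInputFormat which table to scan
>>     hbConf.set(TableInputFormat.INPUT_TABLE, tbName)
>>     // Print the content of hbConf to check that it read the correct hbase-site.xml
>>     val it = hbConf.iterator()
>>     while (it.hasNext) {
>>       val e = it.next()
>>       println("Key=" + e.getKey + " Value=" + e.getValue)
>>     }
>>     val rdd = sc.newAPIHadoopRDD(hbConf, classOf[TableInputFormat],
>>       classOf[ImmutableBytesWritable], classOf[Result])
>>     // Print every cell of every row; this runs on the executors
>>     rdd.foreach(x => {
>>       val key = x._1.toString
>>       val it = x._2.listCells().iterator()
>>       while (it.hasNext) {
>>         val c = it.next()
>>         val family = Bytes.toString(CellUtil.cloneFamily(c))
>>         val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>         val value = Bytes.toString(CellUtil.cloneValue(c))
>>         val tm = c.getTimestamp
>>         println("Key=" + key + " Family=" + family + " Qualifier=" + qualifier +
>>           " Value=" + value + " TimeStamp=" + tm)
>>       }
>>     })
>>   }
>> }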
>> *** I used the following command to run my program ***
>> spark-submit --class --master
>> yarn-cluster --driver-java-options
>> "
>>" --conf
>> spark.executor.extraJavaOptions="
>>" /home/spark/myApps/TestHBase.jar
>> -- Original Message --
>> *From:* "Bill Q"
>> *Sent:* Wednesday, May 20, 2015, 10:13 PM
>> *To:* "donhoff_h"
>> *Cc:* "yuzhihong"; "user"
>> *Subject:* Re: How to use spark to access HBase with Security enabled
>> I have similar problem that I cannot pass the HBase configura

Re: How to use spark to access HBase with Security enabled

2015-05-21 Thread Ted Yu
Re: How to use spark to access HBase with Security enabled

2015-05-20 Thread Bill Q
I have a similar problem: I can no longer pass the HBase configuration
directory to Spark as extra classpath using
spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used
to run this in 1.2 without any problem.
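
One way to narrow this down is to ask each JVM whether it can actually see hbase-site.xml. The following is a minimal diagnostic sketch, not something posted in the thread; the object name CheckHBaseSiteOnClasspath and the partition count of 4 are arbitrary choices for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical diagnostic: reports where (if anywhere) hbase-site.xml is
// visible on the classpath of the driver and of the executors.
object CheckHBaseSiteOnClasspath {
  // Returns the resource URL as seen by the current JVM, or a marker if absent.
  def locate(): String =
    Option(Thread.currentThread().getContextClassLoader.getResource("hbase-site.xml"))
      .map(_.toString)
      .getOrElse("<not on classpath>")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CheckHBaseSiteOnClasspath"))
    println("driver: " + locate())
    // Each task reports its host name and what its own JVM can see.
    sc.parallelize(1 to 4, 4)
      .map(_ => java.net.InetAddress.getLocalHost.getHostName + ": " + locate())
      .distinct()
      .collect()
      .foreach(println)
    sc.stop()
  }
}
```

If the driver line shows a path but the executor lines show the marker, the extra classpath setting is not reaching the executors.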

On Tuesday, May 19, 2015, donhoff_h wrote:

> Sorry, this ref does not help me. I have set up the configuration in
> hbase-site.xml, but it seems there are still some extra configurations to
> be set or APIs to be called before my Spark program can pass HBase
> authentication.
> Does anybody know how to set up authentication to a secured HBase in a Spark
> program that uses the "newAPIHadoopRDD" API to get information from HBase?
> Many Thanks!
> -- Original Message --
> *From:* "yuzhihong"
> *Sent:* Tuesday, May 19, 2015, 9:54 PM
> *To:* "donhoff_h"
> *Cc:* "user"
> *Subject:* Re: How to use spark to access HBase with Security enabled
> Please take a look at:
> Cheers
> On Tue, May 19, 2015 at 5:23 AM, donhoff_h wrote:
>> The principal is the user that I used to run my Spark programs. I am sure
>> I have run the kinit command to make it take effect. I also used the HBase
>> Shell to verify that this user has the right to scan and put the tables
>> in HBase.
>> I still have no idea how to solve this problem. Can anybody help me
>> figure it out? Many Thanks!
>> -- Original Message --
>> *From:* "yuzhihong"
>> *Sent:* Tuesday, May 19, 2015, 7:55 PM
>> *To:* "donhoff_h"
>> *Cc:* "user"
>> *Subject:* Re: How to use spark to access HBase with Security enabled
>> Which user did you run your program as?
>> Have you granted proper permission on the HBase side?
>> You should also check the master log to see if there is some clue.
>> Cheers
>> On May 19, 2015, at 2:41 AM, donhoff_h wrote:
>> Hi, experts.
>> I ran the "HBaseTest" program, an example from the Apache Spark source
>> code, to learn how to use Spark to access HBase, but I got the
>> following exception:
>> Exception in thread "main"
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=36, exceptions:
>> Tue May 19 16:59:11 CST 2015, null,
>> callTimeout=6, callDuration=68648: row 'spark_t01,,00' on
>> table 'hbase:meta' at region=hbase:meta,,1.1588230740,
>>,16020,1431412877700, seqNum=0
>> I also checked the RegionServer log of the host "" listed
>> in the above exception. I found a few entries like the following one:
>> 2015-05-19 16:59:11,143 DEBUG
>> [RpcServer.reader=2,,port=16020] ipc.RpcServer:
>> RpcServer.listener,port=16020: Caught exception while
>> reading:Authentication is required
>> The above entry does not point clearly to my program, but the timestamps
>> are very close. Since my HBase version is 1.0.0 and I have security
>> enabled, I suspect the exception was caused by Kerberos authentication,
>> but I am not sure.
>> Does anybody know if my guess is right? And if I am right, could anybody
>> tell me how to set up Kerberos authentication in a Spark program? I don't
>> know how to do it. I already checked the API docs but did not find any
>> useful API. Many Thanks!
>> By the way, my Spark version is 1.3.0. I also paste the code of
>> "HBaseTest" below:
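>> *** Source Code ***
>> import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
>> import org.apache.hadoop.hbase.client.HBaseAdmin
>> import org.apache.hadoop.hbase.mapreduce.TableInputFormat
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> object HBaseTest {
>>   def main(args: Array[String]) {
>>     val sparkConf = new SparkConf().setAppName("HBaseTest")
>>     val sc = new SparkContext(sparkConf)
>>     val conf = HBaseConfiguration.create()
>>     conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>     // Initialize the HBase table if necessary
>>     val admin = new HBaseAdmin(conf)
>>     if (!admin.isTableAvailable(args(0))) {
>>       val tableDesc = new HTableDescriptor(args(0))
>>       admin.createTable(tableDesc)
>>     }
>>     // Read the table as an RDD of (row key, Result) pairs
>>     val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>       classOf[org.apache.hadoop.hbase.client.Result])
>>     hBaseRDD.count()
>>     sc.stop()
>>   }
>> }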

Many thanks.
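
On the question quoted above of how to set up Kerberos authentication from inside a Spark program: the sketch below shows one possible approach, logging in from a keytab in the driver JVM and again in each executor JVM through Hadoop's UserGroupInformation. Nothing in the thread confirms that this is the fix. The principal and keytab path are placeholders, the keytab is assumed to exist at the same path on every node, and the object name KeytabLoginSketch is made up. The table "spark_t01" and row "row01" are taken from the code posted in the thread.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.{SparkConf, SparkContext}

object KeytabLoginSketch {
  // Placeholder values (assumptions): replace with a real principal and a
  // keytab path that is readable on every node.
  val principal = "user@EXAMPLE.COM"
  val keytabPath = "/path/to/user.keytab"

  // Log the current JVM in from the keytab when Hadoop security is enabled.
  def loginIfSecure(): Unit = {
    val conf = HBaseConfiguration.create()
    UserGroupInformation.setConfiguration(conf)
    if (UserGroupInformation.isSecurityEnabled) {
      UserGroupInformation.loginUserFromKeytab(principal, keytabPath)
    }
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("KeytabLoginSketch"))
    loginIfSecure() // driver JVM
    sc.parallelize(Seq("row01"), 1).foreachPartition { rows =>
      loginIfSecure() // executor JVM
      val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val tbl = conn.getTable(TableName.valueOf("spark_t01"))
      try rows.foreach(r => println(tbl.get(new Get(Bytes.toBytes(r)))))
      finally { tbl.close(); conn.close() }
    }
    sc.stop()
  }
}
```

The login is repeated inside foreachPartition because, as reported earlier in the thread, the driver and the executors authenticate separately: a login that succeeds in the driver does not carry over to the executor JVMs.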


Re: How to use spark to access HBase with Security enabled

2015-05-19 Thread Ted Yu
Please take a look at:



Re: How to use spark to access HBase with Security enabled

2015-05-19 Thread Ted Yu
Which user did you run your program as?

Have you granted proper permission on the HBase side?

You should also check the master log to see if there is some clue.

