Re: How to use spark to access HBase with Security enabled
You might also enable Kerberos debug output in hadoop-env.sh:

  # Extra Java runtime options. Empty by default.
  export HADOOP_OPTS="${HADOOP_OPTS} -Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true"

and check that the principals are the same on the NameNode and DataNode; you can confirm this on all nodes in hdfs-site.xml. You can also ensure all nodes in the cluster are kerberized in core-site.xml (no authentication by default):

  hadoop.security.authentication
  kerberos
  Set the authentication for the cluster. Valid values are: simple or kerberos.

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html

Best Regards
Frank

> On May 22, 2015, at 4:25 AM, Ted Yu wrote:
>
> Can you share the exception(s) you encountered ?
>
> Thanks
>
>> On May 22, 2015, at 12:33 AM, donhoff_h <165612...@qq.com> wrote:
>>
>> Hi,
>>
>> My modified code is listed below; I just added the SecurityUtil API. I
>> don't know which property keys I should use, so I made up two of my own
>> property keys to locate the keytab and principal.
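[Editor's note] Spelled out as it would appear in core-site.xml, the property Frank refers to looks roughly like this. This is a sketch: the second property, hadoop.security.authorization, is a companion setting usually enabled alongside Kerberos, not something stated in this thread.

```xml
<!-- core-site.xml (all nodes): switch the cluster from the default
     "simple" authentication to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<!-- Assumption: authorization is typically enabled together with Kerberos -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```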
>>
>> object TestHBaseRead2 {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hbConf = HBaseConfiguration.create()
>>     hbConf.set("dhao.keytab.file", "/etc/spark/keytab/spark.user.keytab")
>>     hbConf.set("dhao.user.principal", "sp...@bgdt.dev.hrb")
>>     SecurityUtil.login(hbConf, "dhao.keytab.file", "dhao.user.principal")
>>     val conn = ConnectionFactory.createConnection(hbConf)
>>     val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>>     try {
>>       val get = new Get(Bytes.toBytes("row01"))
>>       val res = tbl.get(get)
>>       println("result:" + res.toString)
>>     }
>>     finally {
>>       tbl.close()
>>       conn.close()
>>     }
>>
>>     val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>>     val v = rdd.sum()
>>     println("Value=" + v)
>>     sc.stop()
>>   }
>> }
>>
>> -- Original Message --
>> From: "yuzhihong";
>> Sent: Friday, May 22, 2015, 3:25 PM
>> To: "donhoff_h" <165612...@qq.com>;
>> Cc: "Bill Q"; "user";
>> Subject: Re: Re: How to use spark to access HBase with Security enabled
>>
>> Can you post the modified code ?
>>
>> Thanks
>>
>> On May 21, 2015, at 11:11 PM, donhoff_h <165612...@qq.com> wrote:
>>
>>> Hi,
>>>
>>> Thanks very much for the reply. I have tried the "SecurityUtil". I can
>>> see from the log that this statement executed successfully, but I still
>>> can not pass the authentication of HBase. And with more experiments, I
>>> found a new interesting scenario. If I run the program in yarn-client
>>> mode, the driver can pass the authentication, but the executors can not.
>>> If I run the program in yarn-cluster mode, neither the driver nor the
>>> executors can pass the authentication. Can anybody give me some clue
>>> with this info? Many Thanks!
>>>
>>> -- Original Message --
>>> From: "yuzhihong";
>>> Sent: Friday, May 22, 2015, 5:29 AM
>>> To: "donhoff_h" <165612...@qq.com>;
>>> Cc: "Bill Q"; "user";
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> Are the worker nodes colocated with HBase region servers ?
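[Editor's note] The yarn-client observation above (driver authenticates, executors do not) suggests the Kerberos login only happened in the driver JVM. A pattern sometimes used in this situation is to perform the login inside the closure so that it runs in each executor JVM. This is an unverified sketch, not the thread's confirmed fix: `rdd` stands for the RDD built in the program above, the keytab path and elided principal are copied from the thread, and the keytab file must exist on every worker node.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Sketch: run the keytab login on the executors, not just the driver.
// UserGroupInformation.loginUserFromKeytab is the Hadoop API that
// SecurityUtil.login ends up calling.
rdd.foreachPartition { _ =>
  val conf = new Configuration()
  conf.set("hadoop.security.authentication", "kerberos")
  UserGroupInformation.setConfiguration(conf)
  UserGroupInformation.loginUserFromKeytab(
    "sp...@bgdt.dev.hrb",                    // principal (elided in the thread)
    "/etc/spark/keytab/spark.user.keytab")   // keytab must exist on every node
  // ...create the HBase connection and issue gets/scans here...
}
```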
>>>
>>> Were you running as the hbase super user ?
>>>
>>> You may need to login, using code similar to the following:
>>>
>>>   if (isSecurityEnabled()) {
>>>     SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>>   }
>>>
>>> SecurityUtil is a hadoop class.
>>>
>>> Cheers
>>>
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h <165612...@qq.com> wrote:
>>>
>>> Hi,
>>>
>>> Many thanks for the help. My Spark version is 1.3.0 too and I run it on
>>> Yarn. According to your advice I have changed the configuration. Now my
>>> program can read the hbase-site.xml correctly, and it can also
>>> authenticate with zookeeper successfully.
>>>
>>> But I met the following exception:
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:727)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:880)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:849)
>> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1173)
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
>> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31751)
>> at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
>> at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:187)
>> at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>>
>> ***I also list my code below in case someone can give me some advice from it***
>> object TestHBaseRead {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>     val tbName = if (args.length == 1) args(0) else "ns_dev1:hd01"
>>     hbConf.set(TableInputFormat.INPUT_TABLE, tbName)
>>     // I print the content of hbConf to check if it read the correct hbase-site.xml
>>     val it = hbConf.iterator()
>>     while (it.hasNext) {
>>       val e = it.next()
>>       println("Key=" + e.getKey + " Value=" + e.getValue)
>>     }
>>
>>     val rdd = sc.newAPIHadoopRDD(hbConf, classOf[TableInputFormat],
>>       classOf[ImmutableBytesWritable], classOf[Result])
>>     rdd.foreach(x => {
>>       val key = x._1.toString
>>       val cells = x._2.listCells().iterator()
>>       while (cells.hasNext) {
>>         val c = cells.next()
>>         val family = Bytes.toString(CellUtil.cloneFamily(c))
>>         val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>         val value = Bytes.toString(CellUtil.cloneValue(c))
>>         val tm = c.getTimestamp
>>         println("Key=" + key + " Family=" + family + " Qualifier=" + qualifier +
>>           " Value=" + value + " TimeStamp=" + tm)
>>       }
>>     })
>>     sc.stop()
>>   }
>> }
>>
>> ***I used the following command to run my program***
>> spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
>>   --master yarn-cluster \
>>   --driver-java-options "-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>   --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
>>   /home/spark/myApps/TestHBase.jar
>>
>> -- Original Message --
>> From: "Bill Q";
>> Sent: Wednesday, May 20, 2015, 10:13 PM
>> To: "donhoff_h" <165612...@qq.com>;
>> Cc: "yuzhihong"; "user" <user@spark.apache.org>;
>> Subject: Re: How to use spark to access HBase with Security enabled
>>
>> I have similar problem that I cannot pass the HBase configuration file as
>> extra classpath to Spark any more using
>> spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used to
>> run this in 1.2 without any problem.
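[Editor's note] For readers who have not written one, the JAAS file passed above as /home/spark/spark-hbase.jaas typically contains a login entry like the following sketch. The entry name "Client" is the default that the ZooKeeper client looks up; the keytab path and the (elided) principal are copied from earlier in this thread, and the exact entry name your cluster expects is an assumption.

```
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/spark/keytab/spark.user.keytab"
  principal="sp...@bgdt.dev.hrb"
  storeKey=true
  useTicketCache=false;
};
```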
Re: How to use spark to access HBase with Security enabled
I have similar problem that I cannot pass the HBase configuration file as
extra classpath to Spark any more using
spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used to run
this in 1.2 without any problem.

On Tuesday, May 19, 2015, donhoff_h <165612...@qq.com> wrote:

>
> Sorry, this ref does not help me. I have set up the configuration in
> hbase-site.xml. But it seems there are still some extra configurations to
> be set or APIs to be called to make my spark program able to pass the
> authentication with HBase.
>
> Does anybody know how to set up authentication to a secured HBase in a
> spark program which uses the API "newAPIHadoopRDD" to get information
> from HBase?
>
> Many Thanks!
>
> -- Original Message --
> From: "yuzhihong";
> Sent: Tuesday, May 19, 2015, 9:54 PM
> To: "donhoff_h" <165612...@qq.com>;
> Cc: "user";
> Subject: Re: How to use spark to access HBase with Security enabled
>
> Please take a look at:
>
> http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation
>
> Cheers
>
> On Tue, May 19, 2015 at 5:23 AM, donhoff_h <165612...@qq.com> wrote:
>
>>
>> The principal is sp...@bgdt.dev.hrb. It is the user that I used to run
>> my spark programs. I am sure I have run the kinit command to make it
>> take effect. And I also used the HBase Shell to verify that this user
>> has the right to scan and put the tables in HBase.
>>
>> Now I still have no idea how to solve this problem. Can anybody help me
>> to figure it out? Many Thanks!
>>
>> -- Original Message --
>> From: "yuzhihong";
>> Sent: Tuesday, May 19, 2015, 7:55 PM
>> To: "donhoff_h" <165612...@qq.com>;
>> Cc: "user";
>> Subject: Re: How to use spark to access HBase with Security enabled
>>
>> Which user did you run your program as ?
>>
>> Have you granted proper permission on hbase side ?
>>
>> You should also check the master log to see if there is some clue.
>>
>> Cheers
>>
>> On May 19, 2015, at 2:41 AM, donhoff_h <165612...@qq.com> wrote:
>>
>> Hi, experts.
>>
>> I ran the "HBaseTest" program, which is an example from the Apache Spark
>> source code, to learn how to use spark to access HBase. But I met the
>> following exception:
>>
>> Exception in thread "main"
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=36, exceptions:
>> Tue May 19 16:59:11 CST 2015, null, java.net.SocketTimeoutException:
>> callTimeout=6, callDuration=68648: row 'spark_t01,,00' on
>> table 'hbase:meta' at region=hbase:meta,,1.1588230740,
>> hostname=bgdt01.dev.hrb,16020,1431412877700, seqNum=0
>>
>> I also checked the RegionServer log of the host "bgdt01.dev.hrb" listed
>> in the above exception. I found a few entries like the following one:
>>
>> 2015-05-19 16:59:11,143 DEBUG
>> [RpcServer.reader=2,bindAddress=bgdt01.dev.hrb,port=16020] ipc.RpcServer:
>> RpcServer.listener,port=16020: Caught exception while
>> reading: Authentication is required
>>
>> The above entry does not point to my program clearly, but the time is
>> very close. Since my HBase version is 1.0.0 and security is enabled, I
>> suspect the exception was caused by the Kerberos authentication. But I
>> am not sure.
>>
>> Does anybody know if my guess is right? And if so, could anybody tell me
>> how to set up Kerberos authentication in a spark program? I don't know
>> how to do it. I already checked the API doc but did not find any useful
>> API. Many Thanks!
>>
>> By the way, my spark version is 1.3.0. I also paste the code of
>> "HBaseTest" below:
>>
>> ***Source Code***
>> object HBaseTest {
>>   def main(args: Array[String]) {
>>     val sparkConf = new SparkConf().setAppName("HBaseTest")
>>     val sc = new SparkContext(sparkConf)
>>     val conf = HBaseConfiguration.create()
>>     conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>
>>     // Initialize the HBase table if necessary
>>     val admin = new HBaseAdmin(conf)
>>     if (!admin.isTableAvailable(args(0))) {
>>       val tableDesc = new HTableDescriptor(args(0))
>>       admin.createTable(tableDesc)
>>     }
>>
>>     val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>       classOf[org.apache.hadoop.hbase.client.Result])
>>
>>     hBaseRDD.count()
>>
>>     sc.stop()
>>   }
>> }
>>

--
Many thanks.
Bill
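[Editor's note] On Bill's point that spark.executor.extraClassPath=MY_HBASE_CONF_DIR stopped working in Spark 1.3: a workaround often suggested (untested here; the class name and jar path are copied from the spark-submit command earlier in the thread, the hbase-site.xml path is an example) is to ship the HBase client config with the job via --files, which places it in each YARN container's working directory:

```
spark-submit --class dhao.test.read.singleTable.TestHBaseRead \
  --master yarn-cluster \
  --files /etc/hbase/conf/hbase-site.xml \
  /home/spark/myApps/TestHBase.jar
```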