Thanks Jacek!
I have already set the hbase.security.authentication property to kerberos,
and HBase with Kerberos is working fine on its own.
I tested again after correcting the typo but got the same error. Here is
the code, please have a look:
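// JVM-level Kerberos settings: krb5.conf for the realm, zk-jaas.conf for the ZooKeeper client login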
System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
System.setProperty("java.security.auth.login.config",
"/etc/hbase/conf/zk-jaas.conf");
val hconf = HBaseConfiguration.create()
val tableName = "emp"
hconf.set("hbase.zookeeper.quorum", "hadoop-master")
hconf.set(TableInputFormat.INPUT_TABLE, tableName)
hconf.set("hbase.zookeeper.property.clientPort", "2181")
hconf.set("hbase.master", "hadoop-master:60000")
hconf.set("hadoop.security.authentication", "kerberos")
hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
val conf = new SparkConf()
conf.set("spark.yarn.security.tokens.hbase.enabled", "true")
conf.set("spark.authenticate", "true")
conf.set("spark.authenticate.secret","None")
val sc = new SparkContext(conf)
val hBaseRDD = sc.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])
val count = hBaseRDD.count()
print("HBase RDD count:" + count)
On Sat, Aug 13, 2016 at 8:36 PM, Jacek Laskowski <[email protected]> wrote:
> Hi Aneela,
>
> My (little to no) understanding of how to make it work is to set the
> hbase.security.authentication property to kerberos (see [1]).
>
> Spark on YARN uses it to get the tokens for Hive, HBase et al. (see
> [2]). It happens when the Client starts its conversation with the
> YARN RM (see [3]).
>
> You should not do that yourself (and BTW you've got a typo in the
> spark.yarn.security.tokens.habse.enabled setting). I think the
> entire code you pasted matches what Spark does itself before
> requesting resources from YARN.
>
> Give it a shot and report back, since I've never worked with such a
> configuration and would love to improve in this (security) area.
> Thanks!
>
> [1] http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt__section_s1l_nwv_ls
> [2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HBaseCredentialProvider.scala#L58
> [3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L396
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Fri, Aug 12, 2016 at 11:30 PM, Aneela Saleem <[email protected]>
> wrote:
> > Thanks for your response Jacek!
> >
> > Here is the code showing how Spark accesses HBase:
> > System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
> > System.setProperty("java.security.auth.login.config",
> > "/etc/hbase/conf/zk-jaas.conf");
> > val hconf = HBaseConfiguration.create()
> > val tableName = "emp"
> > hconf.set("hbase.zookeeper.quorum", "hadoop-master")
> > hconf.set(TableInputFormat.INPUT_TABLE, tableName)
> > hconf.set("hbase.zookeeper.property.clientPort", "2181")
> > hconf.set("hbase.master", "hadoop-master:60000")
> > hconf.set("hadoop.security.authentication", "kerberos")
> > hconf.set("hbase.security.authentication", "kerberos")
> > hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
> > hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
> > UserGroupInformation.setConfiguration(hconf)
> > UserGroupInformation.loginUserFromKeytab("spark@platalyticsrealm", "/etc/hadoop/conf/sp.keytab")
> > conf.set("spark.yarn.security.tokens.habse.enabled", "true")
> > conf.set("hadoop.security.authentication", "true")
> > conf.set("hbase.security.authentication", "true")
> > conf.set("spark.authenticate", "true")
> > conf.set("spark.authenticate.secret","None")
> > val sc = new SparkContext(conf)
> > UserGroupInformation.setConfiguration(hconf)
> > val keyTab = "/etc/hadoop/conf/sp.keytab"
> > val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI("spark/hadoop-master@platalyticsrealm", keyTab)
> > UserGroupInformation.setLoginUser(ugi)
> > HBaseAdmin.checkHBaseAvailable(hconf);
> > ugi.doAs(new PrivilegedExceptionAction[Void]() {
> >   override def run(): Void = {
> >     val conf = new SparkConf().set("spark.shuffle.consolidateFiles", "true")
> >     val sc = new SparkContext(conf)
> >     val hbaseContext = new HBaseContext(sc, hconf)
> >
> >     val scan = new Scan()
> >     scan.addColumn(columnName, "column1")
> >     scan.setTimeRange(0L, 1416083300000L)
> >     val rdd = hbaseContext.hbaseRDD("emp", scan)
> >     println(rdd.count)
> >     rdd.saveAsTextFile("hdfs://hadoop-master:8020/hbaseTemp/")
> >     sc.stop()
> >     return null
> >   }
> > })
> > I have tried it with both Spark versions, 2.0 and 1.5.3, but the same
> > exception was thrown.
> >
> > I floated this email on the HBase community as well; they recommended
> > that I use Cloudera's SparkOnHBase library. I tried the above code as
> > they asked, but nothing works. I'm stuck here.
> >
> >
> > On Sat, Aug 13, 2016 at 7:07 AM, Jacek Laskowski <[email protected]>
> wrote:
> >>
> >> Hi,
> >>
> >> How do you access HBase? What's the version of Spark?
> >>
> >> (I don't see Spark packages in the stack trace.)
> >>
> >> Pozdrawiam,
> >> Jacek Laskowski
> >> ----
> >> https://medium.com/@jaceklaskowski/
> >> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> >> Follow me at https://twitter.com/jaceklaskowski
> >>
> >>
> >> On Sun, Aug 7, 2016 at 9:02 AM, Aneela Saleem <[email protected]>
> >> wrote:
> >> > Hi all,
> >> >
> >> > I'm trying to run a Spark job that accesses HBase with security
> >> > enabled. When I run the following command:
> >> >
> >> > /usr/local/spark-2/bin/spark-submit \
> >> >   --keytab /etc/hadoop/conf/spark.keytab \
> >> >   --principal spark/hadoop-master@platalyticsrealm \
> >> >   --class com.platalytics.example.spark.App \
> >> >   --master yarn \
> >> >   --driver-class-path /root/hbase-1.2.2/conf \
> >> >   /home/vm6/project-1-jar-with-dependencies.jar
> >> >
> >> >
> >> > I get the following error:
> >> >
> >> >
> >> > 2016-08-07 20:43:57,617 WARN  [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> >> > 2016-08-07 20:43:57,619 ERROR [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
> >> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> >> >   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> >> >   at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
> >> >   at java.security.AccessController.doPrivileged(Native Method)
> >> >   at javax.security.auth.Subject.doAs(Subject.java:415)
> >> >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
> >> >   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
> >> >   at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> >> >   at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> >> >   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
> >> >   at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201)
> >> >   at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180)
> >> >   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
> >> >   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
> >> >   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
> >> >   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
> >> >   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
> >> >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >   at java.lang.Thread.run(Thread.java:745)
> >> > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
> >> >   at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
> >> >   at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
> >> >   at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
> >> >   at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
> >> >   at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
> >> >   at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
> >> >   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
> >> >   ... 25 more
> >> >
> >> >
> >> > I have Spark running on YARN with security enabled. I have kinit'd
> >> > from the console and have provided the necessary principals and
> >> > keytabs. Can you please help me find out the issue?
> >> >
> >> >
> >> > Thanks
> >
> >
>