Hi,

Since we are not on HBase 2.0.0 yet, we are using SparkOnHBase.
Dependency:

    <dependency>
      <groupId>com.cloudera</groupId>
      <artifactId>spark-hbase</artifactId>
      <version>0.0.2-clabs</version>
    </dependency>

It only takes a small snippet of code. For a general scan using a start and
stop time as the scan time range:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Scan
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.{SparkConf, SparkContext}
    import com.cloudera.spark.hbase.HBaseContext

    val conf = new SparkConf().
      set("spark.shuffle.consolidateFiles", "true").
      set("spark.kryo.registrationRequired", "false").
      set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
      set("spark.kryoserializer.buffer", "30m").
      set("spark.shuffle.spill", "true").
      set("spark.shuffle.memoryFraction", "0.4")
    val sc = new SparkContext(conf)
    val hc = new HBaseContext(sc, HBaseConfiguration.create())

    val scan = new Scan()
    // Family and qualifier must be byte arrays.
    scan.addColumn(Bytes.toBytes(columnFamily), Bytes.toBytes("column1"))
    scan.setTimeRange(scanRowStartTs, scanRowStopTs)

    // The third argument is the function mapped over each
    // (ImmutableBytesWritable, Result) pair the scan returns.
    val rdd = hc.hbaseRDD(inputTableName, scan, filter)

(columnFamily, scanRowStartTs, scanRowStopTs, inputTableName and filter are
placeholders for your own values.)

To run it, just use the following:

    spark-submit --class ClassName \
      --master yarn-client \
      --driver-memory 2000M \
      --executor-memory 5G \
      --keytab <location of keytab> \
      --principal <principal> \
      <path to application jar>

That should work in the general case. The principal and keytab options handle
the kerberized access, and you can of course use the other scan / put / get
methods as well.
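For example, writes go through the same HBaseContext. Here is a minimal
sketch, reusing sc and hc from the snippet above; the table name "table1",
the column family "cf" and the sample rows are placeholders I made up, so
swap in your own:

    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.util.Bytes

    // An RDD of (rowKey, value) pairs to write out.
    val rows = sc.parallelize(Seq(("row1", "value1"), ("row2", "value2")))

    // bulkPut takes the RDD, the table name, a function turning each
    // element into a Put, and an autoFlush flag.
    hc.bulkPut[(String, String)](
      rows,
      "table1",
      (kv) => {
        val put = new Put(Bytes.toBytes(kv._1))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("column1"),
          Bytes.toBytes(kv._2))
        put
      },
      false) // autoFlush

Gets and deletes follow the same shape via bulkGet and bulkDelete.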
Thanks,
Nkechi

On 9 August 2016 at 15:20, Aneela Saleem <ane...@platalytics.com> wrote:

> Thanks Nkechi,
>
> Can you please direct me to a code snippet using the HBase-on-Spark
> module? I've been trying that for the last few days but have not found a
> workaround.
>
> On Tue, Aug 9, 2016 at 6:13 PM, Nkechi Achara <nkach...@googlemail.com>
> wrote:
>
> > Hey,
> >
> > Have you tried the HBase-on-Spark module, or the spark-hbase module, to
> > connect? The principal and keytab options should work out of the box
> > for kerberized access. I can attempt your code if you don't have the
> > ability to use those modules.
> >
> > Thanks
> > K
> >
> > On 9 Aug 2016 2:25 p.m., "Aneela Saleem" <ane...@platalytics.com> wrote:
> >
> > > Hi all,
> > >
> > > I'm trying to connect to HBase with security enabled from a Spark
> > > job. I have kinit'd from the command line. When I run the following
> > > job:
> > >
> > > /usr/local/spark-2/bin/spark-submit \
> > >   --keytab /etc/hadoop/conf/spark.keytab \
> > >   --principal spark/hadoop-master@platalyticsrealm \
> > >   --class com.platalytics.example.spark.App \
> > >   --master yarn \
> > >   --driver-class-path /root/hbase-1.2.2/conf \
> > >   /home/vm6/project-1-jar-with-dependencies.jar
> > >
> > > I get the error:
> > >
> > > 2016-08-07 20:43:57,617 WARN
> > > [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl:
> > > Exception encountered while connecting to the server:
> > > javax.security.sasl.SaslException: GSS initiate failed [Caused by
> > > GSSException: No valid credentials provided (Mechanism level: Failed
> > > to find any Kerberos tgt)]
> > > 2016-08-07 20:43:57,619 ERROR
> > > [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl:
> > > SASL authentication failed. The most likely cause is missing or
> > > invalid credentials. Consider 'kinit'.
> > > javax.security.sasl.SaslException: GSS initiate failed [Caused by
> > > GSSException: No valid credentials provided (Mechanism level: Failed
> > > to find any Kerberos tgt)]
> > >   at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> > >   at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
> > >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
> > >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
> > >   at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
> > >
> > > Following is my code:
> > >
> > > System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")
> > > System.setProperty("java.security.auth.login.config",
> > >   "/etc/hbase/conf/zk-jaas.conf")
> > >
> > > val hconf = HBaseConfiguration.create()
> > > val tableName = "emp"
> > > hconf.set("hbase.zookeeper.quorum", "hadoop-master")
> > > hconf.set(TableInputFormat.INPUT_TABLE, tableName)
> > > hconf.set("hbase.zookeeper.property.clientPort", "2181")
> > > hconf.set("hadoop.security.authentication", "kerberos")
> > > hconf.set("hbase.security.authentication", "kerberos")
> > > hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
> > > hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
> > > UserGroupInformation.setConfiguration(hconf)
> > > val keyTab = "/etc/hadoop/conf/spark.keytab"
> > > val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
> > >   "spark/hadoop-master@platalyticsrealm", keyTab)
> > > UserGroupInformation.setLoginUser(ugi)
> > > ugi.doAs(new PrivilegedExceptionAction[Void]() {
> > >   override def run(): Void = {
> > >     val conf = new SparkConf
> > >     val sc = new SparkContext(conf)
> > >     sc.addFile(keyTab)
> > >     val hBaseRDD = sc.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
> > >       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> > >       classOf[org.apache.hadoop.hbase.client.Result])
> > >     println("Number of records found: " + hBaseRDD.count())
> > >     hBaseRDD.foreach(x => println(new String(x._2.getRow())))
> > >     sc.stop()
> > >     null
> > >   }
> > > })
> > >
> > > Please have a look and help me find the issue.
> > >
> > > Thanks
> > >
> >
>