RE: HBase / Spark Kerberos problem

philipp.meyerhoefer Thu, 19 May 2016 05:12:08 -0700

Thanks Tom & John!

modifying spark-env.sh did the trick - my last line in the file is now:

export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt"):`hbase 
classpath`:/etc/hbase/conf:/etc/hbase/conf/hbase-site.xml

Now o.a.s.d.y.Client logs “Added HBase security token to credentials” and the 
.count() on my HBase RDD works fine.

From: Ellis, Tom (Financial Markets IT) [mailto:tom.el...@lloydsbanking.com] 
Sent: 19 May 2016 09:51
To: 'John Trengrove'; Meyerhoefer, Philipp (TR Technology & Ops)
Cc: user
Subject: RE: HBase / Spark Kerberos problem

Yeah we ran into this issue. Key part is to have the hbase jars and 
hbase-site.xml config on the classpath of the spark submitter.

We did it slightly differently from Y Bodnar, where we set the required jars 
and config on the env var SPARK_DIST_CLASSPATH in our spark env file (rather 
than SPARK_CLASSPATH which is deprecated).

With this and –principal/--keytab, if you turn DEBUG logging for 
org.apache.spark.deploy.yarn you should see “Added HBase security token to 
credentials.”

Otherwise you should at least hopefully see the error where it fails to add the 
HBase tokens.

Check out the source of Client [1] and YarnSparkHadoopUtil  [2] – you’ll see 
how obtainTokenForHBase is being done.

It’s a bit confusing as to why it says you haven’t kinited even when you do 
loginUserFromKeytab – I haven’t quite worked through the reason for that yet.

Cheers,

Tom Ellis
telli...@gmail.com

[1] 
https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
[2] 
https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala

From: John Trengrove [mailto:john.trengr...@servian.com.au] 
Sent: 19 May 2016 08:09
To: philipp.meyerhoe...@thomsonreuters.com
Cc: user
Subject: Re: HBase / Spark Kerberos problem

-- This email has reached the Bank via an external source -- 

Have you had a look at this issue?

https://issues.apache.org/jira/browse/SPARK-12279 

There is a comment by Y Bodnar on how they successfully got Kerberos and HBase 
working.

2016-05-18 18:13 GMT+10:00 <philipp.meyerhoe...@thomsonreuters.com>:
Hi all,

I have been puzzling over a Kerberos problem for a while now and wondered if 
anyone can help.

For spark-submit, I specify --keytab x --principal y, which creates my 
SparkContext fine.
Connections to Zookeeper Quorum to find the HBase master work well too.
But when it comes to a .count() action on the RDD, I am always presented with 
the stack trace at the end of this mail.

We are using CDH5.5.2 (spark 1.5.0), and com.cloudera.spark.hbase.HBaseContext 
is a wrapper around TableInputFormat/hadoopRDD (see 
https://github.com/cloudera-labs/SparkOnHBase), as you can see in the stack 
trace.

Am I doing something obvious wrong here?
A similar flow, inside test code, works well, only going via spark-submit 
exposes this issue.

Code snippet (I have tried using the commented-out lines in various 
combinations, without success):

   val conf = new SparkConf().
      set("spark.shuffle.consolidateFiles", "true").
      set("spark.kryo.registrationRequired", "false").
      set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
      set("spark.kryoserializer.buffer", "30m")
    val sc = new SparkContext(conf)
    val cfg = sc.hadoopConfiguration
//    cfg.addResource(new 
org.apache.hadoop.fs.Path("/etc/hbase/conf/hbase-site.xml"))
//    
UserGroupInformation.getCurrentUser.setAuthenticationMethod(UserGroupInformation.AuthenticationMethod.KERBEROS)
//    cfg.set("hbase.security.authentication", "kerberos")
    val hc = new HBaseContext(sc, cfg)
    val scan = new Scan
    scan.setTimeRange(startMillis, endMillis)
    val matchesInRange = hc.hbaseRDD(MY_TABLE, scan, resultToMatch)
    val cnt = matchesInRange.count()
    log.info(s"matches in range $cnt")

Stack trace / log:

16/05/17 17:04:47 INFO SparkContext: Starting job: count at Analysis.scala:93
16/05/17 17:04:47 INFO DAGScheduler: Got job 0 (count at Analysis.scala:93) 
with 1 output partitions
16/05/17 17:04:47 INFO DAGScheduler: Final stage: ResultStage 0(count at 
Analysis.scala:93)
16/05/17 17:04:47 INFO DAGScheduler: Parents of final stage: List()
16/05/17 17:04:47 INFO DAGScheduler: Missing parents: List()
16/05/17 17:04:47 INFO DAGScheduler: Submitting ResultStage 0 
(MapPartitionsRDD[1] at map at HBaseContext.scala:580), which has no missing 
parents
16/05/17 17:04:47 INFO MemoryStore: ensureFreeSpace(3248) called with 
curMem=428022, maxMem=244187136
16/05/17 17:04:47 INFO MemoryStore: Block broadcast_3 stored as values in 
memory (estimated size 3.2 KB, free 232.5 MB)
16/05/17 17:04:47 INFO MemoryStore: ensureFreeSpace(2022) called with 
curMem=431270, maxMem=244187136
16/05/17 17:04:47 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in 
memory (estimated size 2022.0 B, free 232.5 MB)
16/05/17 17:04:47 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 
10.6.164.40:33563 (size: 2022.0 B, free: 232.8 MB)
16/05/17 17:04:47 INFO SparkContext: Created broadcast 3 from broadcast at 
DAGScheduler.scala:861
16/05/17 17:04:47 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 0 (MapPartitionsRDD[1] at map at HBaseContext.scala:580)
16/05/17 17:04:47 INFO YarnScheduler: Adding task set 0.0 with 1 tasks
16/05/17 17:04:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 
hpg-dev-vm, partition 0,PROCESS_LOCAL, 2208 bytes)
16/05/17 17:04:47 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 
hpg-dev-vm:52698 (size: 2022.0 B, free: 388.4 MB)
16/05/17 17:04:48 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 
hpg-dev-vm:52698 (size: 26.0 KB, free: 388.4 MB)
16/05/17 17:04:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 
hpg-dev-vm): org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't 
get the location
        at 
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)
        at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:155)
        at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:63)
        at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at 
org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
        at 
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
        at 
org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:161)
        at 
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:156)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)
        at 
org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.restart(TableRecordReaderImpl.java:90)
        at 
org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.initialize(TableRecordReaderImpl.java:167)
        at 
org.apache.hadoop.hbase.mapreduce.TableRecordReader.initialize(TableRecordReader.java:138)
        at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.initialize(TableInputFormatBase.java:200)
        at 
org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:153)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:124)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Could not set up IO Streams to hpg-dev-vm 
/127.0.0.1:60020
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:773)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:890)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:859)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1193)
        at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
        at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
        at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:32627)
        at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1583)
        at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1293)
        at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1125)
        at 
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)
        ... 26 more
Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
likely cause is missing or invalid credentials. Consider 'kinit'.
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:673)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:631)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:739)
        ... 36 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
        at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
        at 
org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:605)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:154)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:731)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:728)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at 
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:728)
        ... 36 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed 
to find any Kerberos tgt)
        at 
sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at 
sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
        at 
sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
        ... 45 more

--
Philipp Meyerhoefer
Thomson Reuters
philipp.meyerhoe...@tr.com

________________________________

This e-mail is for the sole use of the intended recipient and contains 
information that may be privileged and/or confidential. If you are not an 
intended recipient, please notify the sender by return e-mail and delete this 
e-mail and any attachments. Certain required legal entity disclosures can be 
accessed on our website.<http://site.thomsonreuters.com/site/disclosures/>

Lloyds Banking Group plc. Registered Office: The Mound, Edinburgh EH1 1YZ. 
Registered in Scotland no. SC95000. Telephone: 0131 225 4555. Lloyds Bank plc. 
Registered Office: 25 Gresham Street, London EC2V 7HN. Registered in England 
and Wales no. 2065. Telephone 0207626 1500. Bank of Scotland plc. Registered 
Office: The Mound, Edinburgh EH1 1YZ. Registered in Scotland no. SC327000. 
Telephone: 03457 801 801. Cheltenham & Gloucester plc. Registered Office: 
Barnett Way, Gloucester GL4 3RL. Registered in England and Wales 2299428. 
Telephone: 0345 603 1637
Lloyds Bank plc, Bank of Scotland plc are authorised by the Prudential 
Regulation Authority and regulated by the Financial Conduct Authority and 
Prudential Regulation Authority.
Cheltenham & Gloucester plc is authorised and regulated by the Financial 
Conduct Authority.
Halifax is a division of Bank of Scotland plc. Cheltenham & Gloucester Savings 
is a division of Lloyds Bank plc.
HBOS plc. Registered Office: The Mound, Edinburgh EH1 1YZ. Registered in 
Scotland no. SC218813.
This e-mail (including any attachments) is private and confidential and may 
contain privileged material. If you have received this e-mail in error, please 
notify the sender and delete it (including any attachments) immediately. You 
must not copy, distribute, disclose or use any of the information in it or any 
attachments. Telephone calls may be monitored or recorded.

RE: HBase / Spark Kerberos problem

Reply via email to