Hi Ted,

The Spark executors and the HBase RegionServers/Masters are all co-located. This is a 2-node test environment.

Best,
Eric

Eric Walk, Sr. Technical Consultant
p: 617.855.9255 |  NASDAQ: PRFT  |  Perficient.com<http://www.perficient.com/>

From: Ted Yu <yuzhih...@gmail.com>
Sent: Mar 18, 2015 2:46 PM
To: Eric Walk
Cc: user@spark.apache.org; Bill Busch
Subject: Re: Spark + HBase + Kerberos

Are hbase config / keytab files deployed on executor machines ?

Consider adding -Dsun.security.krb5.debug=true to the JVM options for debugging purposes.

Cheers
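A minimal sketch of how that debug flag could be threaded through spark-submit so it reaches both the driver and the executor JVMs. This reuses the class and jar names from Eric's command later in the thread; spark.executor.extraJavaOptions is a standard Spark property, but whether this surfaces the root cause here is an assumption, not a confirmed fix:

```shell
# Sketch: turn on JDK Kerberos tracing in the driver and every executor.
# Class/jar names are copied from the command in this thread.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --master yarn-client \
  --driver-java-options "-Dsun.security.krb5.debug=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  ~/spark-test_2.10-1.0.jar
```

The resulting trace appears on the executors' stderr, so it would need to be pulled from the YARN container logs rather than the driver console.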

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk 
<eric.w...@perficient.com<mailto:eric.w...@perficient.com>> wrote:
We're having an issue connecting to HBase from a Spark container in a secure cluster. We haven't been able to get past it; any thoughts would be appreciated.

We’re able to perform some operations like “CreateTable” successfully in the driver thread. Read requests (which always run in the executor threads) consistently fail with:
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]

Logs and the Scala source are attached; the names of the innocent have been masked for their protection (in a consistent manner).

We're executing the following Spark job (using HDP 2.2, Spark 1.2.0, HBase 0.98.4, Kerberos on AD):
export 
SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf

/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest --driver-memory 
2g --executor-memory 1g --executor-cores 1 --num-executors 1 --master 
yarn-client ~/spark-test_2.10-1.0.jar
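SPARK_CLASSPATH only affects JVMs launched on this node, so the executors may never see the HBase client configuration. One sketch of an alternative, assuming the standard --files mechanism ships hbase-site.xml into each YARN container's working directory (this is a guess at the failure mode, not a verified fix for this cluster):

```shell
# Sketch: distribute hbase-site.xml to the executors alongside the job.
# The conf path is taken from the HDP 2.2 layout used elsewhere in this
# thread; everything else mirrors the original command.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --master yarn-client \
  --files /usr/hdp/2.2.0.0-2041/hbase/conf/hbase-site.xml \
  --driver-memory 2g --executor-memory 1g \
  --executor-cores 1 --num-executors 1 \
  ~/spark-test_2.10-1.0.jar
```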

We see this error in the executor processes (attached as yarn log.txt):
2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0] 
security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos 
principal name is hbase/ldevawshdp0002.<dc1>.pvc@<dc1>.PVC
2015-03-18 17:34:15,128 WARN  [Executor task launch worker-0] ipc.RpcClient: 
Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0] ipc.RpcClient: 
SASL authentication failed. The most likely cause is missing or invalid 
credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
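The "Failed to find any Kerberos tgt" error means the executor JVM has no ticket of its own when the HBase SASL handshake runs. One workaround sometimes used for this class of failure is to log in from a keytab inside the task itself before touching HBase. A rough Scala sketch, where the principal and keytab path are placeholders and the keytab is assumed to already exist on every worker node:

```scala
// Sketch only: log in from a keytab inside an executor task and run the
// HBase access under that identity. Principal and keytab path are
// hypothetical; the keytab must be present on each node (or shipped
// with --files) for this to work.
import org.apache.hadoop.security.UserGroupInformation

def withKerberosLogin[T](principal: String, keytabPath: String)(body: => T): T = {
  // Creates a fresh UGI from the keytab without disturbing the process-wide login.
  val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytabPath)
  ugi.doAs(new java.security.PrivilegedExceptionAction[T] {
    override def run(): T = body
  })
}

// Hypothetical usage inside an RDD operation:
// rdd.mapPartitions { iter =>
//   withKerberosLogin("user1@DC1.PVC", "/etc/security/keytabs/user1.keytab") {
//     ... open HBase connection, scan, etc. ...
//   }
// }
```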

The HBase Master Logs show success:
2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: 
RpcServer.listener,port=60000: connection from 
10.4.0.6:46636<http://10.4.0.6:46636>; # active connections: 3
2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Kerberos principal name is hbase/ldevawshdp0001.<dc1>.pvc@<DC1>.PVC
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Created SASL server with mechanism = GSSAPI
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Have read input token of size 1501 for processing by 
saslServer.evaluateResponse()
2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Will send token of size 108 from saslServer.
2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Have read input token of size 0 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Will send token of size 32 from saslServer.
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
Have read input token of size 32 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=60000] 
security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting canonicalized 
client ID: <user1>@<DC1>.PVC
2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
SASL server context established. Authenticated client: <user1>@<DC1>.PVC 
(auth:SIMPLE). Negotiated QoP is auth
2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: 
RpcServer.listener,port=60000: DISCONNECTING client 
10.4.0.6:46636<http://10.4.0.6:46636> because read count=-1. Number of active 
connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: 
RpcServer.listener,port=60000: connection from 
10.4.0.6:46733<http://10.4.0.6:46733>; # active connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=60000] ipc.RpcServer: 
RpcServer.listener,port=60000: DISCONNECTING client 
10.4.0.6:46733<http://10.4.0.6:46733> because read count=-1. Number of active 
connections: 3

The Spark Driver Console Output hangs at this point:
2015-03-18 17:34:13,337 INFO  [main] spark.DefaultExecutionContext: Starting 
job: count at HBaseTest.scala:63
2015-03-18 17:34:13,349 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Got job 0 (count at HBaseTest.scala:63) with 1 output 
partitions (allowLocal=false)
2015-03-18 17:34:13,350 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Final stage: Stage 0(count at HBaseTest.scala:63)
2015-03-18 17:34:13,350 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Parents of final stage: List()
2015-03-18 17:34:13,355 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Missing parents: List()
2015-03-18 17:34:13,362 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Submitting Stage 0 (NewHadoopRDD[0] at newAPIHadoopRDD 
at HBaseTest.scala:59), which has no missing parents
2015-03-18 17:34:13,377 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.MemoryStore: ensureFreeSpace(1680) called with curMem=394981, 
maxMem=1111794647
2015-03-18 17:34:13,378 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated 
size 1680.0 B, free 1059.9 MB)
2015-03-18 17:34:13,380 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.MemoryStore: ensureFreeSpace(1165) called with curMem=396661, 
maxMem=1111794647
2015-03-18 17:34:13,381 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory 
(estimated size 1165.0 B, free 1059.9 MB)
2015-03-18 17:34:13,382 INFO  [sparkDriver-akka.actor.default-dispatcher-2] 
storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 
ldevawshdp0001.<dc1>.pvc:36023 (size: 1165.0 B, free: 1060.2 MB)
2015-03-18 17:34:13,384 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
2015-03-18 17:34:13,392 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
spark.DefaultExecutionContext: Created broadcast 1 from broadcast at 
DAGScheduler.scala:838
2015-03-18 17:34:13,408 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 
(NewHadoopRDD[0] at newAPIHadoopRDD at HBaseTest.scala:59)
2015-03-18 17:34:13,409 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
cluster.YarnClientClusterScheduler: Adding task set 0.0 with 1 tasks
2015-03-18 17:34:13,416 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
util.RackResolver: Resolved ldevawshdp0002.<dc1>.pvc to /default-rack
2015-03-18 17:34:13,428 INFO  [sparkDriver-akka.actor.default-dispatcher-2] 
scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 
ldevawshdp0001.<dc1>.pvc, RACK_LOCAL, 1384 bytes)
2015-03-18 17:34:13,846 INFO  [sparkDriver-akka.actor.default-dispatcher-2] 
storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 
ldevawshdp0001.<dc1>.pvc:50375 (size: 1165.0 B, free: 530.3 MB)
2015-03-18 17:34:13,993 INFO  [sparkDriver-akka.actor.default-dispatcher-4] 
storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 
ldevawshdp0001.<dc1>.pvc:50375 (size: 53.2 KB, free: 530.2 MB)
