Hi Ted,

The Spark executors and the HBase region servers/masters are all co-located. This is a 2-node test environment.
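Ted's -Dsun.security.krb5.debug=true suggestion (quoted below) needs to reach the executor JVMs as well as the driver, since the failures occur in the executor threads. A sketch of the submit command with the flag added on both sides, reusing the paths and options from the quoted message (unverified against this particular cluster):

```shell
# Enable JDK Kerberos tracing on the driver JVM and on each executor JVM.
# Paths and resource options are copied from the original spark-submit invocation.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --driver-memory 2g --executor-memory 1g \
  --executor-cores 1 --num-executors 1 \
  --master yarn-client \
  --driver-java-options "-Dsun.security.krb5.debug=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  ~/spark-test_2.10-1.0.jar
```

The executor-side Kerberos trace then shows up in the YARN container logs, next to the SASL errors.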
Best,
Eric

Eric Walk, Sr. Technical Consultant
p: 617.855.9255 | NASDAQ: PRFT | Perficient.com

From: Ted Yu <yuzhih...@gmail.com>
Sent: Mar 18, 2015 2:46 PM
To: Eric Walk
Cc: user@spark.apache.org; Bill Busch
Subject: Re: Spark + HBase + Kerberos

Are hbase config / keytab files deployed on executor machines?

Consider adding -Dsun.security.krb5.debug=true for debug purposes.

Cheers

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk <eric.w...@perficient.com> wrote:

We're having an issue connecting to HBase from a Spark container in a secure cluster. We haven't been able to get past it; any thoughts would be appreciated.

We're able to perform some operations, like "CreateTable", in the driver thread successfully. Read requests (which always run in the executor threads) always fail with:

No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Logs and the Scala source are attached; the names of the innocent have been masked for their protection (in a consistent manner).

We are executing the following Spark job (on HDP 2.2, Spark 1.2.0, HBase 0.98.4, Kerberos against AD):

export SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf

/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest --driver-memory 2g --executor-memory 1g --executor-cores 1 --num-executors 1 --master yarn-client ~/spark-test_2.10-1.0.jar

We see this error in the executor processes (attached as yarn log.txt):

2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0] security.HBaseSaslRpcClient: Creating SASL GSSAPI client.
Server's Kerberos principal name is hbase/ldevawshdp0002.<dc1>.pvc@<dc1>.PVC

2015-03-18 17:34:15,128 WARN [Executor task launch worker-0] ipc.RpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0] ipc.RpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

The HBase master logs show success:

2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: connection from 10.4.0.6:46636; # active connections: 3
2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Kerberos principal name is hbase/ldevawshdp0001.<dc1>.pvc@<DC1>.PVC
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Created SASL server with mechanism = GSSAPI
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 1501 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Will send token of size 108 from saslServer.
2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 0 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Will send token of size 32 from saslServer.
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 32 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=60000] security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting canonicalized client ID: <user1>@<DC1>.PVC
2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: SASL server context established. Authenticated client: <user1>@<DC1>.PVC (auth:SIMPLE). Negotiated QoP is auth
2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: DISCONNECTING client 10.4.0.6:46636 because read count=-1. Number of active connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: connection from 10.4.0.6:46733; # active connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: DISCONNECTING client 10.4.0.6:46733 because read count=-1.
Number of active connections: 3

The Spark driver console output hangs at this point:

2015-03-18 17:34:13,337 INFO [main] spark.DefaultExecutionContext: Starting job: count at HBaseTest.scala:63
2015-03-18 17:34:13,349 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Got job 0 (count at HBaseTest.scala:63) with 1 output partitions (allowLocal=false)
2015-03-18 17:34:13,350 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Final stage: Stage 0(count at HBaseTest.scala:63)
2015-03-18 17:34:13,350 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Parents of final stage: List()
2015-03-18 17:34:13,355 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Missing parents: List()
2015-03-18 17:34:13,362 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Submitting Stage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at HBaseTest.scala:59), which has no missing parents
2015-03-18 17:34:13,377 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: ensureFreeSpace(1680) called with curMem=394981, maxMem=1111794647
2015-03-18 17:34:13,378 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1680.0 B, free 1059.9 MB)
2015-03-18 17:34:13,380 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: ensureFreeSpace(1165) called with curMem=396661, maxMem=1111794647
2015-03-18 17:34:13,381 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1165.0 B, free 1059.9 MB)
2015-03-18 17:34:13,382 INFO [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ldevawshdp0001.<dc1>.pvc:36023 (size: 1165.0 B, free: 1060.2 MB)
2015-03-18 17:34:13,384 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
2015-03-18 17:34:13,392 INFO [sparkDriver-akka.actor.default-dispatcher-4] spark.DefaultExecutionContext: Created broadcast 1 from broadcast at DAGScheduler.scala:838
2015-03-18 17:34:13,408 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at HBaseTest.scala:59)
2015-03-18 17:34:13,409 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientClusterScheduler: Adding task set 0.0 with 1 tasks
2015-03-18 17:34:13,416 INFO [sparkDriver-akka.actor.default-dispatcher-4] util.RackResolver: Resolved ldevawshdp0002.<dc1>.pvc to /default-rack
2015-03-18 17:34:13,428 INFO [sparkDriver-akka.actor.default-dispatcher-2] scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ldevawshdp0001.<dc1>.pvc, RACK_LOCAL, 1384 bytes)
2015-03-18 17:34:13,846 INFO [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ldevawshdp0001.<dc1>.pvc:50375 (size: 1165.0 B, free: 530.3 MB)
2015-03-18 17:34:13,993 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ldevawshdp0001.<dc1>.pvc:50375 (size: 53.2 KB, free: 530.2 MB)
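For readers without the attachments: the driver log lines above reference newAPIHadoopRDD at HBaseTest.scala:59 and count at HBaseTest.scala:63, which suggests the job has roughly the shape below. This is an illustrative sketch, not the attached source: the table name is hypothetical, and the TableMapReduceUtil.initCredentials call is a commonly suggested addition for secure clusters (it obtains an HBase delegation token in the driver, where a valid TGT exists) rather than something shown in the thread.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

object HBaseTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HBaseTest"))

    // Picks up hbase-site.xml from the conf directory on SPARK_CLASSPATH.
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, "test_table") // hypothetical table name

    // Suggested for secure clusters: fetch an HBase delegation token while the
    // driver still holds a Kerberos TGT; the executor JVMs have no TGT of their own.
    val job = Job.getInstance(conf)
    TableMapReduceUtil.initCredentials(job)

    val rdd = sc.newAPIHadoopRDD(
      job.getConfiguration,              // newAPIHadoopRDD, cf. HBaseTest.scala:59
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(rdd.count())                 // count, cf. HBaseTest.scala:63
    sc.stop()
  }
}
```

Whether the delegation token actually reaches already-running executors depends on the Spark version; on Spark 1.2, making the keytab and HBase config available to the executor machines (per Ted's question) may still be required.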