Are the HBase config and keytab files deployed on the executor machines? Consider adding -Dsun.security.krb5.debug=true to the JVM options for debugging.
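One way to wire that debug flag into both the driver and the executors is via standard spark-submit options; a sketch below, reusing the paths from your command (the keytab path in the klist check is an assumption — adjust to your deployment):

```shell
# Enable JDK Kerberos tracing in the driver and executor JVMs.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest \
  --master yarn-client \
  --driver-java-options "-Dsun.security.krb5.debug=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  ~/spark-test_2.10-1.0.jar

# On each executor node, confirm the HBase client config is present and that
# a keytab is readable by the YARN container user (paths are assumptions):
ls /usr/hdp/2.2.0.0-2041/hbase/conf/hbase-site.xml
klist -kt /etc/security/keytabs/hbase.service.keytab
```

The debug output should show on which node the TGT lookup fails, which narrows down whether it is a classpath problem or a missing credential on the executors.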
Cheers

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk <eric.w...@perficient.com> wrote:

> Having an issue connecting to HBase from a Spark container in a Secure
> Cluster. Haven't been able to get past this issue; any thoughts would be
> appreciated.
>
> We're able to perform some operations like "CreateTable" in the driver
> thread successfully. Read requests (always in the executor threads) are
> always failing with:
>
> No valid credentials provided (Mechanism level: Failed to find any
> Kerberos tgt)]
>
> Logs and Scala source are attached; the names of the innocent have been
> masked for their protection (in a consistent manner).
>
> Executing the following Spark job (using HDP 2.2, Spark 1.2.0, HBase
> 0.98.4, Kerberos on AD):
>
> export SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf
>
> /usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest --driver-memory 2g --executor-memory 1g --executor-cores 1 --num-executors 1 --master yarn-client ~/spark-test_2.10-1.0.jar
>
> We see this error in the executor processes (attached as yarn log.txt):
>
> 2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0] security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos principal name is hbase/ldevawshdp0002.<dc1>.pvc@<dc1>.PVC
> 2015-03-18 17:34:15,128 WARN [Executor task launch worker-0] ipc.RpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0] ipc.RpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>
> The HBase Master logs show success:
>
> 2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: connection from 10.4.0.6:46636; # active connections: 3
> 2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Kerberos principal name is hbase/ldevawshdp0001.<dc1>.pvc@<DC1>.PVC
> 2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Created SASL server with mechanism = GSSAPI
> 2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 1501 for processing by saslServer.evaluateResponse()
> 2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Will send token of size 108 from saslServer.
> 2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 0 for processing by saslServer.evaluateResponse()
> 2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Will send token of size 32 from saslServer.
> 2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: Have read input token of size 32 for processing by saslServer.evaluateResponse()
> 2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=60000] security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting canonicalized client ID: <user1>@<DC1>.PVC
> 2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: SASL server context established. Authenticated client: <user1>@<DC1>.PVC (auth:SIMPLE). Negotiated QoP is auth
> 2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: DISCONNECTING client 10.4.0.6:46636 because read count=-1. Number of active connections: 3
> 2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: connection from 10.4.0.6:46733; # active connections: 3
> 2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: DISCONNECTING client 10.4.0.6:46733 because read count=-1. Number of active connections: 3
>
> The Spark driver console output hangs at this point:
>
> 2015-03-18 17:34:13,337 INFO [main] spark.DefaultExecutionContext: Starting job: count at HBaseTest.scala:63
> 2015-03-18 17:34:13,349 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Got job 0 (count at HBaseTest.scala:63) with 1 output partitions (allowLocal=false)
> 2015-03-18 17:34:13,350 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Final stage: Stage 0(count at HBaseTest.scala:63)
> 2015-03-18 17:34:13,350 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Parents of final stage: List()
> 2015-03-18 17:34:13,355 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Missing parents: List()
> 2015-03-18 17:34:13,362 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Submitting Stage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at HBaseTest.scala:59), which has no missing parents
> 2015-03-18 17:34:13,377 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: ensureFreeSpace(1680) called with curMem=394981, maxMem=1111794647
> 2015-03-18 17:34:13,378 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1680.0 B, free 1059.9 MB)
> 2015-03-18 17:34:13,380 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: ensureFreeSpace(1165) called with curMem=396661, maxMem=1111794647
> 2015-03-18 17:34:13,381 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1165.0 B, free 1059.9 MB)
> 2015-03-18 17:34:13,382 INFO [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ldevawshdp0001.<dc1>.pvc:36023 (size: 1165.0 B, free: 1060.2 MB)
> 2015-03-18 17:34:13,384 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
> 2015-03-18 17:34:13,392 INFO [sparkDriver-akka.actor.default-dispatcher-4] spark.DefaultExecutionContext: Created broadcast 1 from broadcast at DAGScheduler.scala:838
> 2015-03-18 17:34:13,408 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at HBaseTest.scala:59)
> 2015-03-18 17:34:13,409 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientClusterScheduler: Adding task set 0.0 with 1 tasks
> 2015-03-18 17:34:13,416 INFO [sparkDriver-akka.actor.default-dispatcher-4] util.RackResolver: Resolved ldevawshdp0002.<dc1>.pvc to /default-rack
> 2015-03-18 17:34:13,428 INFO [sparkDriver-akka.actor.default-dispatcher-2] scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ldevawshdp0001.<dc1>.pvc, RACK_LOCAL, 1384 bytes)
> 2015-03-18 17:34:13,846 INFO [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ldevawshdp0001.<dc1>.pvc:50375 (size: 1165.0 B, free: 530.3 MB)
> 2015-03-18 17:34:13,993 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ldevawshdp0001.<dc1>.pvc:50375 (size: 53.2 KB, free: 530.2 MB)
>
> *Eric Walk*, Sr. Technical Consultant
> p: 617.855.9255 | NASDAQ: PRFT | *Perficient.com <http://www.perficient.com/>*
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org