Re: Spark + HBase + Kerberos

2015-03-18 Thread Eric Walk
Hi Ted,

The spark executors and hbase regions/masters are all collocated. This is a 2 
node test environment.

Best,
Eric

Eric Walk, Sr. Technical Consultant
p: 617.855.9255 | NASDAQ: PRFT | Perficient.com (http://www.perficient.com/)

From: Ted Yu yuzhih...@gmail.com
Sent: Mar 18, 2015 2:46 PM
To: Eric Walk
Cc: user@spark.apache.org; Bill Busch
Subject: Re: Spark + HBase + Kerberos

Are hbase config / keytab files deployed on executor machines ?

Consider adding -Dsun.security.krb5.debug=true for debug purpose.

Cheers
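Both suggestions above can be sketched as a single submit command. This is a sketch only: `--files` and `spark.executor.extraJavaOptions` are standard Spark 1.x options, but the config path, principal setup, and jar name are placeholders that must match the actual cluster.

```shell
# Sketch, assuming an HDP-style layout; adjust paths to your cluster.
# --files ships hbase-site.xml into each executor container's working dir,
# so the HBase client on the executors sees the secure-cluster settings.
# The extraJavaOptions flags turn on JDK Kerberos tracing in both JVMs.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --master yarn-client \
  --files /usr/hdp/2.2.0.0-2041/hbase/conf/hbase-site.xml \
  --conf "spark.driver.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  ~/spark-test_2.10-1.0.jar
```

With the debug flag on the executors as well as the driver, the GSSAPI trace shows up in the YARN container logs rather than only in the driver console.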

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk 
eric.w...@perficient.com wrote:
We're having an issue connecting to HBase from a Spark container in a secure 
cluster. We haven't been able to get past it; any thoughts would be appreciated.

We’re able to perform some operations like “CreateTable” in the driver thread 
successfully. Read requests (always in the executor threads) are always failing 
with:
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]

Logs and Scala source are attached; the names of the innocent have been masked 
for their protection (in a consistent manner).

Executing the following spark job (using HDP 2.2, Spark 1.2.0, HBase 0.98.4, 
Kerberos on AD):
export 
SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf

/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest --driver-memory 
2g --executor-memory 1g --executor-cores 1 --num-executors 1 --master 
yarn-client ~/spark-test_2.10-1.0.jar

We see this error in the executor processes (attached as yarn log.txt):
2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0] 
security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos 
principal name is hbase/ldevawshdp0002.dc1.pvc@dc1.PVC
2015-03-18 17:34:15,128 WARN  [Executor task launch worker-0] ipc.RpcClient: 
Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0] ipc.RpcClient: 
SASL authentication failed. The most likely cause is missing or invalid 
credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
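The "Consider 'kinit'" hint points at the underlying mechanics: the driver JVM holds the user's TGT (which is why CreateTable succeeds there), but the YARN executor containers have no ticket cache, so the HBase client inside each read task fails the GSSAPI handshake. A commonly suggested approach for Spark-on-YARN in this era is to have the driver obtain an HBase delegation token before building the RDD, so executors authenticate with the token instead of needing a TGT of their own. A hedged Scala sketch, assuming a live `SparkContext` named `sc` and a hypothetical table name (`TableMapReduceUtil.initCredentials` is the HBase 0.98 MapReduce utility API):

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}
import org.apache.hadoop.mapreduce.Job

val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "test_table") // hypothetical name

// This runs in the driver, which still holds the kinit'd TGT: ask HBase for
// a delegation token and attach it to the job's credentials. Executors then
// authenticate to the region servers with the token, not a TGT.
val job = Job.getInstance(hbaseConf)
TableMapReduceUtil.initCredentials(job)

val rdd = sc.newAPIHadoopRDD(
  job.getConfiguration, // carries the credentials to the executors
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])
rdd.count() // the read that currently fails in the executor threads
```

This is a sketch under stated assumptions, not a drop-in fix; the key point is that token acquisition must happen driver-side, before the executors first touch HBase.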

The HBase Master Logs show success:
2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: connection from 
10.4.0.6:46636; # active connections: 3
2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Kerberos principal name is hbase/ldevawshdp0001.dc1.pvc@DC1.PVC
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Created SASL server with mechanism = GSSAPI
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 1501 for processing by 
saslServer.evaluateResponse()
2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Will send token of size 108 from saslServer.
2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 0 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Will send token of size 32 from saslServer.
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 32 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=6] 
security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting canonicalized 
client ID: user1@DC1.PVC
2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
SASL server context established. Authenticated client: user1@DC1.PVC 
(auth:SIMPLE). Negotiated QoP is auth
2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: DISCONNECTING client 
10.4.0.6:46636 because read count=-1. Number of active 
connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: connection from 
10.4.0.6:46733; # active connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: DISCONNECTING client 
10.4.0.6:46733 because read count=-1. Number of active 
connections: 3

The Spark Driver Console Output hangs at this point (full output in the quoted message below).

Re: Spark + HBase + Kerberos

2015-03-18 Thread Ted Yu
Are hbase config / keytab files deployed on executor machines ?

Consider adding -Dsun.security.krb5.debug=true for debug purpose.

Cheers

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk eric.w...@perficient.com
wrote:

  We're having an issue connecting to HBase from a Spark container in a secure
 cluster. We haven't been able to get past it; any thoughts would be
 appreciated.



 We’re able to perform some operations like “CreateTable” in the driver
 thread successfully. Read requests (always in the executor threads) are
 always failing with:

 No valid credentials provided (Mechanism level: Failed to find any
 Kerberos tgt)]



 Logs and Scala source are attached; the names of the innocent have been
 masked for their protection (in a consistent manner).



 Executing the following spark job (using HDP 2.2, Spark 1.2.0, HBase
 0.98.4, Kerberos on AD):

 export
 SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf



 /usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest
 --driver-memory 2g --executor-memory 1g --executor-cores 1 --num-executors
 1 --master yarn-client ~/spark-test_2.10-1.0.jar



 We see this error in the executor processes (attached as yarn log.txt):

 2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0]
 security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos
 principal name is hbase/ldevawshdp0002.dc1.pvc@dc1.PVC

 2015-03-18 17:34:15,128 WARN  [Executor task launch worker-0]
 ipc.RpcClient: Exception encountered while connecting to the server :
 javax.security.sasl.SaslException: GSS initiate failed [Caused by
 GSSException: No valid credentials provided (Mechanism level: Failed to
 find any Kerberos tgt)]

 2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0]
 ipc.RpcClient: SASL authentication failed. The most likely cause is missing
 or invalid credentials. Consider 'kinit'.

 javax.security.sasl.SaslException: GSS initiate failed [Caused by
 GSSException: No valid credentials provided (Mechanism level: Failed to
 find any Kerberos tgt)]



 The HBase Master Logs show success:

 2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=6]
 ipc.RpcServer: RpcServer.listener,port=6: connection from
 10.4.0.6:46636; # active connections: 3

 2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Kerberos principal name is hbase/ldevawshdp0001.dc1.pvc@
 DC1.PVC

 2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Created SASL server with mechanism = GSSAPI

 2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Have read input token of size 1501 for processing by
 saslServer.evaluateResponse()

 2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Will send token of size 108 from saslServer.

 2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Have read input token of size 0 for processing by
 saslServer.evaluateResponse()

 2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Will send token of size 32 from saslServer.

 2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: Have read input token of size 32 for processing by
 saslServer.evaluateResponse()

 2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=6]
 security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting
 canonicalized client ID: user1@DC1.PVC

 2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: SASL server context established. Authenticated client:
 user1@DC1.PVC (auth:SIMPLE). Negotiated QoP is auth

 2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=6]
 ipc.RpcServer: RpcServer.listener,port=6: DISCONNECTING client
 10.4.0.6:46636 because read count=-1. Number of active connections: 3

 2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=6]
 ipc.RpcServer: RpcServer.listener,port=6: connection from
 10.4.0.6:46733; # active connections: 3

 2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=6]
 ipc.RpcServer: RpcServer.listener,port=6: DISCONNECTING client
 10.4.0.6:46733 because read count=-1. Number of active connections: 3



 The Spark Driver Console Output hangs at this point:

 2015-03-18 17:34:13,337 INFO  [main] spark.DefaultExecutionContext:
 Starting job: count at HBaseTest.scala:63

 2015-03-18 17:34:13,349 INFO
 [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Got
 job 0 (count at HBaseTest.scala:63) with 1 output partitions
 (allowLocal=false)

 2015-03-18 17:34:13,350 INFO
 [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Final