RE: Query on Spark Hive with kerberos Enabled on Kubernetes
Hi Sandeep,

Any inputs on this?

Regards,
Surya

From: Garlapati, Suryanarayana (Nokia - IN/Bangalore)
Sent: Saturday, July 21, 2018 6:50 PM
To: Sandeep Katta
Cc: d...@spark.apache.org; user@spark.apache.org
Subject: RE: Query on Spark Hive with kerberos Enabled on Kubernetes

Hi Sandeep,

Thanks for the response. I am using the following commands (the xml files hive-site.xml, core-site.xml, and hdfs-site.xml are made available by exporting them through HADOOP_CONF_DIR).

For HDFS access, which succeeds:

./spark-submit --deploy-mode cluster \
  --master k8s://https://k8s-apiserver.bcmt.cluster.local:8443 \
  --kubernetes-namespace default \
  --conf spark.kubernetes.kerberos.enabled=true \
  --conf spark.kubernetes.kerberos.principal= \
  --conf spark.kubernetes.kerberos.keytab= \
  --conf spark.kubernetes.driver.docker.image= \
  --conf spark.kubernetes.executor.docker.image= \
  --conf spark.kubernetes.initcontainer.docker.image= \
  --conf spark.kubernetes.resourceStagingServer.uri=http://:1 \
  ../examples/src/main/python/wordcount.py hdfs://:8020/tmp/wordcount.txt

For Hive access (this is failing):

./spark-submit --deploy-mode cluster \
  --master k8s://https://k8s-apiserver.bcmt.cluster.local:8443 \
  --kubernetes-namespace default \
  --conf spark.kubernetes.kerberos.enabled=true \
  --files /etc/krb5.conf, ,../examples/src/main/resources/kv1.txt \
  --conf spark.kubernetes.kerberos.principal= \
  --conf spark.kubernetes.kerberos.keytab= \
  --conf spark.kubernetes.driver.docker.image= \
  --conf spark.kubernetes.executor.docker.image= \
  --conf spark.kubernetes.initcontainer.docker.image= \
  --conf spark.kubernetes.resourceStagingServer.uri=http://:1 \
  ../examples/src/main/python/sql/hive.py

Following is the error:

2018-07-19 04:15:55 INFO HiveUtils:54 - Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
2018-07-19 04:15:56 INFO metastore:376 - Trying to connect to metastore with URI thrift://vm-10-75-145-54:9083
2018-07-19 04:15:56 ERROR TSaslTransport:315 - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)

If I don't provide krb5.conf in the above spark-submit, I get an error saying it is unable to find any default realm. One workaround I have found: if I generate a TGT with kinit and copy it into the driver pod at /tmp/krb5cc_0, it works fine. I guess this should not be the way to do it; the TGT should be generated automatically so the job can access the Hive metastore. Please let me know if I am doing something wrong.

Regards,
Surya

From: Sandeep Katta [mailto:sandeep0102.opensou...@gmail.com]
Sent: Friday, July 20, 2018 9:59 PM
To: Garlapati, Suryanarayana (Nokia - IN/Bangalore) <suryanarayana.garlap...@nokia.com>
Cc: d...@spark.apache.org; user@spark.apache.org
Subject: Re: Query on Spark Hive with kerberos Enabled on Kubernetes

Can you please tell us what exception you have got, and any logs for the same?
On Fri, 20 Jul 2018 at 8:36 PM, Garlapati, Suryanarayana (Nokia - IN/Bangalore) <suryanarayana.garlap...@nokia.com> wrote:

Hi All,

I am trying to use the Spark 2.2.0 Kubernetes code (https://github.com/apache-spark-on-k8s/spark/tree/v2.2.0-kubernetes-0.5.0) to run Hive queries on a Kerberos-enabled cluster. The spark-submit jobs fail for the Hive queries but pass when I access HDFS. Is this a known limitation, or am I doing something wrong? Please let me know. If this is working, can you please give an example of running Hive queries? Thanks.

Regards,
Surya
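[List-editor note] The manual workaround described in the thread (obtain a TGT with kinit and place it in the driver pod's default ticket cache) can be sketched roughly as below. The keytab path, principal, namespace, and driver pod name are placeholders, not values from this thread; adjust them for your cluster. This is a diagnostic stopgap, not the intended Kerberos flow on Kubernetes.

```shell
# Placeholders -- substitute your own keytab, principal, namespace, and pod name.
KEYTAB=/etc/security/keytabs/spark.keytab
PRINCIPAL=spark/somehost@EXAMPLE.COM
NAMESPACE=default
DRIVER_POD=hive-example-driver

# Obtain a TGT into the local default ticket cache
# (for uid 0 the default cache is /tmp/krb5cc_0).
kinit -kt "$KEYTAB" "$PRINCIPAL"

# Copy the ticket cache into the driver pod so the Hive
# metastore client finds valid Kerberos credentials there.
kubectl cp /tmp/krb5cc_0 "$NAMESPACE/$DRIVER_POD:/tmp/krb5cc_0"
```

Note the cached TGT expires (typically within hours), so this has to be repeated per ticket lifetime; the proper fix is for the submission client to handle delegation automatically, which is what the thread is asking about.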