Hi, Yes. I am able to connect to Hive from simple Java program running in the cluster. When using spark-submit I faced the issue. The output of command is given below
$> netstat -alnp |grep 10001 (Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.) tcp 1 0 53.244.194.223:25612 53.244.194.221:10001 CLOSE_WAIT - Thanks Anupama From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Saturday, September 17, 2016 12:36 AM To: Gangadhar, Anupama (623) Cc: user @spark Subject: Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode) Is your Hive Thrift Server up and running on port jdbc:hive2://<hiveserver2 ADDRESS>10001? Do the following netstat -alnp |grep 10001 and see whether it is actually running HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On 16 September 2016 at 19:53, <anupama.gangad...@daimler.com<mailto:anupama.gangad...@daimler.com>> wrote: Hi, I am trying to connect to Hive from Spark application in Kerborized cluster and get the following exception. Spark version is 1.4.1 and Hive is 1.2.1. Outside of spark the connection goes through fine. Am I missing any configuration parameters? ava.sql.SQLException: Could not open connection to jdbc:hive2://<hiveserver2 ADDRESS>10001/default;principal=hive/<hive server2 host>;ssl=false;transportMode=http;httpPath=cliservice: null at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:206) at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:178) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:215) at SparkHiveJDBCTest$1.call(SparkHiveJDBCTest.java:124) at SparkHiveJDBCTest$1.call(SparkHiveJDBCTest.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1027) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1109) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1108) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1108) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1285) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1116) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1095) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63) at org.apache.spark.scheduler.Task.run(Task.scala:70) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:258) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:203) ... 21 more In spark conf directory hive-site.xml has the following properties <configuration> <property> <name>hive.metastore.kerberos.keytab.file</name> <value>/etc/security/keytabs/hive.service.keytab</value> </property> <property> <name>hive.metastore.kerberos.principal</name> <value>hive/_HOST@<DOMAIN></value> </property> <property> <name>hive.metastore.sasl.enabled</name> <value>true</value> </property> <property> <name>hive.metastore.uris</name> <value>thrift://<Hiveserver2 address>:9083</value> </property> <property> <name>hive.server2.authentication</name> <value>KERBEROS</value> </property> <property> <name>hive.server2.authentication.kerberos.keytab</name> <value>/etc/security/keytabs/hive.service.keytab</value> </property> <property> <name>hive.server2.authentication.kerberos.principal</name> <value>hive/_HOST@<DOMAIN></value> </property> <property> <name>hive.server2.authentication.spnego.keytab</name> <value>/etc/security/keytabs/spnego.service.keytab</value> </property> <property> <name>hive.server2.authentication.spnego.principal</name> <value>HTTP/_HOST@<DOMAIN></value> </property> </configuration> --Thank you If you are not the addressee, please inform us immediately that you have received this e-mail by mistake, and delete it. We thank you for your support. If you are not the addressee, please inform us immediately that you have received this e-mail by mistake, and delete it. We thank you for your support.