Built Spark without Hive, and now spark-sql doesn't work
I want to build Hive and Spark so that Hive runs on the Spark execution engine (Hive on Spark). I chose Hive 2.3.0 and Spark 2.0.0, which the official Hive documentation claims are compatible. According to that documentation, I have to build Spark without the hive profile to avoid a conflict between the original Hive and the Hive bundled with Spark. The build succeeds, but then the problem comes: I can no longer use spark-sql, because spark-sql relies on the Hive libraries and my Spark is a no-hive build:

    [appuser@ab-10-11-22-209 spark]$ spark-sql
    java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:686)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
    You need to build Spark with -Phive and -Phive-thriftserver.

How can I build and set up Spark so that Hive on Spark works properly and spark-sql, pyspark, and spark-shell also work properly? I don't understand the relationship between the spark-integrated Hive and the original Hive. These are the spark-integrated Hive jars:

    hive-beeline-1.2.1.spark2.jar
    hive-cli-1.2.1.spark2.jar
    hive-exec-1.2.1.spark2.jar
    hive-jdbc-1.2.1.spark2.jar
    hive-metastore-1.2.1.spark2.jar
    spark-hive_2.11-2.0.0.jar
    spark-hive-thriftserver_2.11-2.0.0.jar

It seems that Spark 2.0.0 depends on Hive 1.2.1.
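For context, the no-hive build I ran follows the recipe in the Hive on Spark getting-started guide. A rough sketch of the invocation (profile and version values are examples from my environment; adjust the Hadoop version to your cluster):

```
# Sketch of the build command from the Hive on Spark docs for the Spark 2.0.x line.
# Note: no -Phive or -Phive-thriftserver profile, so spark-sql and the
# thrift server classes are deliberately left out of the resulting distribution.
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz \
    "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"
```

The error above is the direct consequence of omitting those two profiles: `SparkSQLCLIDriver` lives in the spark-hive-thriftserver module, which this build never compiles.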
difference between spark-integrated hive and original hive
A related question: what exactly is the difference between the spark-integrated Hive (the hive-*-1.2.1.spark2.jar files listed above) and the original Hive 2.3.0 I installed? Can I just add my Hive 2.3.0 libs to the classpath of Spark?
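For anyone reproducing this setup: the Hive side is wired to Spark through hive-site.xml, per the Hive on Spark guide. A minimal sketch (property names are from that guide; the values here are examples, not my exact config):

```
<!-- hive-site.xml fragment: tell Hive to submit queries to Spark -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>yarn</value>
</property>
```

My uncertainty is whether this Hive 2.3.0 installation and a Spark built *with* `-Phive` (which embeds Hive 1.2.1) can coexist on the same classpath, or whether that is exactly the conflict the docs warn about.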