I'm a bit confused about the documentation in the area of Hive support.

I want to use a remote Hive metastore / HDFS server, and the documentation
says that we need to build Spark from source due to the large number of
dependencies Hive requires.

Specifically the documentation says:

"Hive has a large number of dependencies, it is not included in the default
Spark assembly....This command builds a new assembly jar that includes
Hive."

So I downloaded the source distribution of Spark 1.4.1 and executed the
following build command:

./make-distribution.sh --name spark-1.4.1-hadoop-2.6-hive --tgz \
  -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 \
  -Phive -Phive-thriftserver -DskipTests
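
Is listing the active Maven profiles a sensible way to confirm that the
-Phive flags are actually taking effect? I was thinking of something along
these lines (assuming the build/mvn wrapper that ships in the source tree; a
locally installed mvn should behave the same way):

./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 \
  -Phive -Phive-thriftserver help:active-profiles | grep -i hive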

Inspecting the size of the resulting spark-assembly-1.4.1-hadoop2.6.0.jar, it
is nearly the same size as the pre-built one: the pre-built jar is 162976273
bytes and my custom-built jar is 162976444 bytes. I don't see any new Hive
jar file in the distribution either.
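
My understanding is that with -Phive the Hive classes are supposed to be
folded into that single assembly jar rather than shipped as a separate hive
jar, so I was going to verify it roughly like this (the dist/lib path is just
where my make-distribution.sh run put the assembly):

jar tf dist/lib/spark-assembly-1.4.1-hadoop2.6.0.jar | grep -c 'org/apache/hadoop/hive/'

If that count comes back as zero, I assume it means Hive really did not make
it into the assembly.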

Can someone please help me understand what is going on here?

Cheers,
Reece


