Here is a snippet of the dependency tree for the spark-hive module:

[INFO] org.apache.spark:spark-hive_2.10:jar:1.3.0-SNAPSHOT
...
[INFO] +- org.spark-project.hive:hive-metastore:jar:0.13.1a:compile
[INFO] |  +- org.spark-project.hive:hive-shims:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-common:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-0.20:jar:0.13.1a:runtime
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-common-secure:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-0.20S:jar:0.13.1a:runtime
[INFO] |  |  \- org.spark-project.hive.shims:hive-shims-0.23:jar:0.13.1a:runtime
...
[INFO] +- org.spark-project.hive:hive-exec:jar:0.13.1a:compile
[INFO] |  +- org.spark-project.hive:hive-ant:jar:0.13.1a:compile
[INFO] |  |  \- org.apache.velocity:velocity:jar:1.5:compile
[INFO] |  |     \- oro:oro:jar:2.0.8:compile
[INFO] |  +- org.spark-project.hive:hive-common:jar:0.13.1a:compile
...
[INFO] +- org.spark-project.hive:hive-serde:jar:0.13.1a:compile
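
A tree like the one above can be regenerated from a Spark source checkout with Maven's dependency plugin; the module path sql/hive below assumes the Spark 1.3-era source layout, and -Phive is needed because the spark-hive module only exists under that profile:

```shell
# Print the resolved dependency tree for the spark-hive module only.
# -pl limits the reactor to that module, -am also builds the modules
# it depends on so resolution succeeds.
mvn -Phive -pl sql/hive -am dependency:tree
```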

bq. is there a way to have the hive support without updating the assembly

I don't think so. The -Phive profile is what pulls the Hive dependencies
shown above into the assembly jar in the first place, so without rebuilding
the assembly those classes simply aren't on the classpath.
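
For context, -Phive and -Phive-thriftserver are Maven profiles. A Hive-enabled assembly is typically produced with a build along these lines (any additional flags, such as a Hadoop version profile, depend on your environment):

```shell
# From the Spark source root: build an assembly jar that bundles the
# Hive dependencies shown in the tree above. -DskipTests speeds this up.
mvn -Phive -Phive-thriftserver -DskipTests clean package
```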

On Mon, Mar 2, 2015 at 12:37 PM, nitinkak001 <nitinkak...@gmail.com> wrote:

> I want to run Hive query inside Spark and use the RDDs generated from that
> inside Spark. I read in the documentation
>
> "/Hive support is enabled by adding the -Phive and -Phive-thriftserver
> flags
> to Spark’s build. This command builds a new assembly jar that includes
> Hive.
> Note that this Hive assembly jar must also be present on all of the worker
> nodes, as they will need access to the Hive serialization and
> deserialization libraries (SerDes) in order to access data stored in
> Hive./"
>
> I just wanted to know what the -Phive and -Phive-thriftserver flags really
> do, and whether there is a way to have Hive support without updating the
> assembly. Does that flag add a Hive support jar or something?
>
> The reason I am asking is that I will be using the Cloudera version of
> Spark in the future, and I am not sure how to add Hive support to that
> Spark distribution.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Executing-hive-query-from-Spark-code-tp21880.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
