[
https://issues.apache.org/jira/browse/HIVE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xuefu Zhang updated HIVE-7370:
------------------------------
Attachment: (was: spark_1.0.0.patch)
> Initial ground work for Hive on Spark [Spark branch]
> ----------------------------------------------------
>
> Key: HIVE-7370
> URL: https://issues.apache.org/jira/browse/HIVE-7370
> Project: Hive
> Issue Type: Task
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
> Attachments: HIVE-7370.patch, spark_1.0.0.patch
>
>
> Contribute PoC code for Hive on Spark as the groundwork for subsequent tasks.
> While it contains hacks and badly organized code, it will change, and more
> importantly it allows multiple people to work on different components
> concurrently.
> With this, simple queries such as "select col from tab where ..." and "select
> grp, avg(val) from tab where ... group by grp" can be executed on Spark.
> Contents of the patch:
> 1. Code path for an additional execution engine.
> 2. Essential classes such as SparkWork, SparkTask, SparkCompiler,
> HiveMapFunction, HiveReduceFunction, SparkClient, etc.
> 3. Some code changes to existing classes.
> 4. Build infrastructure.
> 5. Utility classes.
> To try running Hive on Spark, for now you need to:
> 1. Have a self-built Spark 1.0.0 with the attached patch applied.
> 2. Invoke the Hive client with the environment variable MASTER set to the
> master URL of Spark.
> 3. Set hive.execution.engine=spark.
> 4. Execute supported queries.
> NO PRECOMMIT TESTS. This is for spark branch only.
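
As a rough illustration of the steps listed in the description, the Python sketch below assembles the environment and command line for running one query through the Hive CLI with Spark as the execution engine. It is only a sketch under stated assumptions: the helper name build_hive_on_spark_command, the master URL spark://localhost:7077, and the use of "hive -e" to pass the statements are illustrative choices, not part of the patch. Only the MASTER variable and the hive.execution.engine=spark setting come from the description.

```python
import os
import shlex


def build_hive_on_spark_command(master_url, query, hive_bin="hive"):
    """Return (env, argv) for running one query via the Hive CLI on Spark.

    Hypothetical helper for illustration; it only prepares the invocation
    and does not start Hive itself.
    """
    # The description says the Hive client reads the Spark master URL
    # from the MASTER environment variable.
    env = dict(os.environ)
    env["MASTER"] = master_url

    # Select the Spark engine first, then run the query in the same session.
    statements = "set hive.execution.engine=spark; " + query.rstrip("; ") + ";"

    # Assumption: statements are passed with the Hive CLI's -e option.
    argv = [hive_bin, "-e", statements]
    return env, argv


if __name__ == "__main__":
    # The master URL below is a placeholder, not a value from the issue.
    env, argv = build_hive_on_spark_command(
        "spark://localhost:7077", "select col from tab"
    )
    print("MASTER=" + env["MASTER"], " ".join(shlex.quote(a) for a in argv))
    # To actually launch Hive: subprocess.run(argv, env=env, check=True)
```

Keeping the environment and argument list separate from the actual launch makes it easy to inspect the invocation before running it against a patched Spark build.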
--
This message was sent by Atlassian JIRA
(v6.2#6252)