[ 
https://issues.apache.org/jira/browse/HIVE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7370:
------------------------------

    Component/s: Spark

> Initial ground work for Hive on Spark [Spark branch]
> ----------------------------------------------------
>
>                 Key: HIVE-7370
>                 URL: https://issues.apache.org/jira/browse/HIVE-7370
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-7370.patch, spark_1.0.0.patch
>
>
> Contribute PoC code to Hive on Spark as the ground work for subsequent tasks. 
> While it contains hacks and poorly organized code, it will change, and more 
> importantly it allows multiple people to work on different components 
> concurrently.
> With this, simple queries such as "select col from tab where ..." and "select 
> grp, avg(val) from tab where ... group by grp" can be executed on Spark.
> Contents of the patch:
> 1. code path for additional execution engine
> 2. essential classes such as SparkWork, SparkTask, SparkCompiler, 
> HiveMapFunction, HiveReduceFunction, SparkClient, etc.
> 3. Some code changes to existing classes.
> 4. build infrastructure
> 5. utility classes.
> To try running Hive on Spark, for now you need to:
> 1. have a self-built Spark 1.0.0 with the attached patch applied.
> 2. invoke the Hive client with the environment variable MASTER pointing to 
> the master URL of Spark.
> 3. set hive.execution.engine=spark
> 4. execute supported queries.
> NO PRECOMMIT TESTS. This is for spark branch only.
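
For illustration only, the setup described above (the MASTER environment variable plus hive.execution.engine=spark) could be scripted roughly as follows. This is a minimal sketch, not part of the patch: it assumes a `hive` launcher on the PATH that accepts the standard `-e` option, and the function names, the master URL, and the table and column names are hypothetical placeholders.

```python
import os
import subprocess


def build_hive_invocation(master_url, query, hive_bin="hive"):
    """Return (argv, env) for running one query with the Spark engine.

    Hypothetical helper: it only assembles the command and environment,
    so it can be inspected without a Hive or Spark installation.
    """
    env = dict(os.environ)
    # The Hive client reads the Spark master URL from MASTER.
    env["MASTER"] = master_url
    # Select the Spark execution engine, then run the query itself.
    statement = query.strip().rstrip(";")
    script = "set hive.execution.engine=spark; " + statement + ";"
    return [hive_bin, "-e", script], env


def run_on_spark(master_url, query, hive_bin="hive"):
    """Launch the Hive client; raises CalledProcessError on failure."""
    argv, env = build_hive_invocation(master_url, query, hive_bin)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    # Assumed values: a local standalone Spark master and a table "tab".
    argv, env = build_hive_invocation(
        "spark://localhost:7077", "select col from tab"
    )
    print("MASTER=" + env["MASTER"], " ".join(argv))
```

Calling `run_on_spark` would only work against a Hive build from the spark branch with the patched Spark 1.0.0 described above; `build_hive_invocation` alone is safe to run anywhere.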



--
This message was sent by Atlassian JIRA
(v6.2#6252)
