[jira] [Commented] (HIVE-721) Integration with HadoopDB

Lars Francke (JIRA) Fri, 12 Sep 2014 05:01:00 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131455#comment-14131455
 ]


Lars Francke commented on HIVE-721:
-----------------------------------

There's not much development on HadoopDB and there's Tez and Spark now. Do you 
plan to work on this? Otherwise I suggest closing it.

> Integration with HadoopDB
> -------------------------
>
>                 Key: HIVE-721
>                 URL: https://issues.apache.org/jira/browse/HIVE-721
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.4.0
>            Reporter: Azza Abouzeid
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The HadoopDB project integrates Hadoop with single-node databases, which 
> provide a high-performance data layer for analytical queries over structured 
> data. HadoopDB's SMS (SQL-to-MapReduce-to-SQL) component uses Hive's 
> SemanticAnalyzer to convert SQL to MapReduce plans. After plan generation, we 
> recreate SQL from the lower plan operators and push it into the database 
> layer, leaving intact the upper layers of the plan that can't be pushed into 
> the single-node databases. For more information on this process, please 
> read the HadoopDB paper (http://db.cs.yale.edu/hadoopdb/hadoopdb.pdf) and 
> browse the source code if you feel like it (more specifically the 
> SQLQueryGenerator class) at http://sourceforge.net/projects/hadoopdb/. 
> HadoopDB is a natural system-level extension of Hive's goal of providing a 
> simple SQL interface for large-scale data processing.
> A simple patch that integrates Hive with HadoopDB's SMS can be found here: 
> http://hadoopdb.svn.sourceforge.net/viewvc/hadoopdb/trunk/Patches/hive-sms.patch?view=log
> In addition to the semantic analyzer post-processing, we modified certain 
> areas so that paths can be associated with databases, which allows the 
> operator tree to be recreated from the map.input.file configuration. Instead 
> of relying on FileInputSplit, we set up an interface, Pathable, so that any 
> InputSplit that implements Pathable can return a dummy path equivalent to the 
> map.input.file path.
> Instead of calling the SQLQueryGenerator class after semantic analysis, you 
> could also use hooks. One such suggestion, provided by a HadoopDB user, can 
> be found here: 
> http://sourceforge.net/tracker/index.php?func=detail&aid=2829253&group_id=269559&atid=1146689.
> We would really appreciate your help in better integrating Hive and HadoopDB. 
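
The Pathable idea in the quoted description can be illustrated with a minimal Java sketch. It is not taken from hive-sms.patch: the method name getPath, the DatabaseSplit and FileBackedSplit classes, the "/db/host/database/table" path layout, and the inputPathOf helper are all assumptions made for illustration, and the sketch deliberately uses no Hadoop or Hive classes.

```java
// Hypothetical stand-in for the Pathable interface named in the issue;
// the actual signature in hive-sms.patch may differ.
interface Pathable {
    String getPath();
}

// Hypothetical split backed by a table in a single-node database.
final class DatabaseSplit implements Pathable {
    private final String host;
    private final String database;
    private final String table;

    DatabaseSplit(String host, String database, String table) {
        this.host = host;
        this.database = database;
        this.table = table;
    }

    @Override
    public String getPath() {
        // Dummy path: no file exists at this location. It only gives the
        // planner a key that plays the role of map.input.file.
        return "/db/" + host + "/" + database + "/" + table;
    }
}

// Hypothetical file-backed split, shown so both kinds share one interface.
final class FileBackedSplit implements Pathable {
    private final String file;

    FileBackedSplit(String file) {
        this.file = file;
    }

    @Override
    public String getPath() {
        return file;
    }
}

class PathableSketch {
    // Resolves the path used to look up the operator tree for a split,
    // without caring whether the split is a file or a database table.
    static String inputPathOf(Object split) {
        if (split instanceof Pathable) {
            return ((Pathable) split).getPath();
        }
        throw new IllegalArgumentException(
            "split does not expose a path: " + split.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(inputPathOf(new DatabaseSplit("node1", "sales", "orders")));
        System.out.println(inputPathOf(new FileBackedSplit("/warehouse/orders/part-00000")));
    }
}
```

The point of the design, as described in the issue, is that code needing a path depends on the small interface rather than on a concrete file split type, so database-backed splits can take part in the same lookup.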



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
