[
https://issues.apache.org/jira/browse/HIVE-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131455#comment-14131455
]
Lars Francke commented on HIVE-721:
-----------------------------------
There's not much development on HadoopDB and there's Tez and Spark now. Do you
plan to work on this? Otherwise I suggest closing it.
> Integration with HadoopDB
> -------------------------
>
> Key: HIVE-721
> URL: https://issues.apache.org/jira/browse/HIVE-721
> Project: Hive
> Issue Type: New Feature
> Components: Query Processor
> Affects Versions: 0.4.0
> Reporter: Azza Abouzeid
> Priority: Minor
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The HadoopDB project integrates Hadoop with single-node databases, which
> provide a high-performance data layer for analytical queries over structured
> data. HadoopDB's SMS (SQL-to-MapReduce-to-SQL) component uses Hive's
> SemanticAnalyzer to convert SQL to MapReduce plans. After plan generation, we
> recreate SQL from the lower plan operators and push that SQL into the
> database layer. The upper layers of the plan, which can't be pushed into the
> single-node databases, are left intact. For more information on this process,
> please read the HadoopDB paper (http://db.cs.yale.edu/hadoopdb/hadoopdb.pdf)
> and browse the source code if you feel like it (more specifically the
> SQLQueryGenerator class) at http://sourceforge.net/projects/hadoopdb/.
> HadoopDB is a natural system-level extension of Hive's goal of providing a
> simple SQL interface for large-scale data processing.
> A simple patch that integrates Hive with HadoopDB's SMS can be found here:
> http://hadoopdb.svn.sourceforge.net/viewvc/hadoopdb/trunk/Patches/hive-sms.patch?view=log
> In addition to the semantic analyzer post-processing, we modified certain
> areas so that paths can be associated with databases, which allows the
> operator tree to be recreated from the map.input.file configuration. Instead
> of relying on FileInputSplit, we set up an interface, Pathable, so that any
> InputSplit that implements Pathable can return a dummy path equivalent to the
> map.input.file path.
> Instead of calling the SQLQueryGenerator class directly after semantic
> analysis, you could also use hooks. One such suggestion, provided by a
> HadoopDB user, can be found here:
> http://sourceforge.net/tracker/index.php?func=detail&aid=2829253&group_id=269559&atid=1146689.
> We would really appreciate your help in better integrating Hive and HadoopDB.