[ https://issues.apache.org/jira/browse/HIVE-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740139#action_12740139 ]
Azza Abouzeid commented on HIVE-721:
------------------------------------

BTW: we checked out code from the trunk around July 10th 2009.

> Integration with HadoopDB
> -------------------------
>
> Key: HIVE-721
> URL: https://issues.apache.org/jira/browse/HIVE-721
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Affects Versions: 0.4.0
> Reporter: Azza Abouzeid
> Priority: Minor
> Fix For: 0.4.0
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The HadoopDB project integrates Hadoop with single-node databases, which
> provide a high-performance data layer for analytical queries over structured
> data. HadoopDB's SMS (SQL-to-MapReduce-to-SQL) component uses Hive's
> SemanticAnalyzer to convert SQL to MapReduce plans. After plan generation, we
> recreate SQL from the lower plan operators and push that SQL into the database
> layer, while keeping intact the upper layers of the plan, which cannot be
> pushed into the single-node databases. For more information on this process,
> please read the HadoopDB paper (http://db.cs.yale.edu/hadoopdb/hadoopdb.pdf)
> and browse the source code if you feel like it (more specifically the
> SQLQueryGenerator class) at http://sourceforge.net/projects/hadoopdb/.
> HadoopDB is a natural system-level extension of Hive's goal of providing a
> simple SQL interface for large-scale data processing.
> A simple patch that integrates Hive with HadoopDB's SMS can be found here:
> http://hadoopdb.svn.sourceforge.net/viewvc/hadoopdb/trunk/Patches/hive-sms.patch?view=log
> In addition to the semantic analyzer post-processing, we modified certain
> areas so that paths can be associated with databases, which allows the
> operator tree to be recreated from the map.input.file configuration. Instead
> of relying on FileInputSplit, we set up an interface, Pathable, so that any
> InputSplit that implements Pathable can return a dummy path equivalent to the
> map.input.file path.
> Instead of the post-semantic-analysis function call to the SQLQueryGenerator
> class, you could also use hooks. One such suggestion, provided by a HadoopDB
> user, can be found here:
> http://sourceforge.net/tracker/index.php?func=detail&aid=2829253&group_id=269559&atid=1146689
> We would really appreciate your help in better integrating Hive and HadoopDB.
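
For readers unfamiliar with the Pathable idea mentioned in the issue description, it can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from hive-sms.patch: the method name getPath, the DatabaseSplit class, and the SplitPaths helper are hypothetical, and a plain String stands in for Hadoop's Path type so that the example compiles without Hadoop on the classpath.

```java
import java.util.Objects;

/**
 * Hypothetical version of the Pathable interface described in the issue:
 * any split that implements it can report a path, even if it is not file-backed.
 * The method name getPath is an assumption, not taken from the patch.
 */
interface Pathable {
    String getPath();
}

/**
 * Hypothetical split that reads from a database rather than a file.
 * It carries a dummy path that plays the role of the map.input.file value.
 */
final class DatabaseSplit implements Pathable {
    private final String dummyPath;

    DatabaseSplit(String dummyPath) {
        this.dummyPath = Objects.requireNonNull(dummyPath, "dummyPath");
    }

    @Override
    public String getPath() {
        return dummyPath;
    }
}

/** Hypothetical helper showing how calling code might look up a split's path. */
final class SplitPaths {
    private SplitPaths() {}

    /**
     * Returns the split's own path when it implements Pathable;
     * otherwise returns the supplied fallback (for example, the value
     * read from the map.input.file configuration entry).
     */
    static String resolve(Object split, String fallback) {
        if (split instanceof Pathable) {
            return ((Pathable) split).getPath();
        }
        return fallback;
    }
}
```

The point of the design, as far as the description goes, is that the code which maps an input path back to an operator tree no longer needs to know whether the split is file-backed; it only needs something that can hand back a path.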