[ https://issues.apache.org/jira/browse/HIVE-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473788#comment-13473788 ]
Namit Jain commented on HIVE-3565:
----------------------------------

Consider a query like:

select B.y, count(1) from A join B on A.x=B.x group by B.y;

This will require 2 MR jobs. The first MR job will perform the join, and the second MR job will perform the group by (note that the 2nd MR job would have an identity mapper).

If the first MR job could write the output of the join to an HBase table (which is keyed by B.y), the 2nd MR job can be a map-only job which can simply scan the HBase table.

This idea can be extended to joins as well.

> use hbase tables for writing intermediate directories across map-reduce boundaries
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3565
>                 URL: https://issues.apache.org/jira/browse/HIVE-3565
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
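
For illustration only, here is a rough sketch of the plan described in the comment. It is not Hive code. It assumes a plain Python dict can stand in for the HBase table, and that a composite row key (B.y plus a sequence number, a hypothetical choice) is used so that joined rows with the same B.y do not overwrite each other. The function names join_stage and scan_and_count are made up for this example. The point it tries to show is that once the join output is stored in row-key order, the group by needs only an ordered scan and no shuffle.

```python
from itertools import groupby


def join_stage(a_xs, b_rows):
    """Stand-in for the first MR job: join A and B on x, write rows keyed by B.y.

    a_xs   -- list of A.x values
    b_rows -- list of (B.x, B.y) pairs
    Returns a dict that plays the role of the HBase table (an assumption).
    """
    ys_by_x = {}
    for x, y in b_rows:
        ys_by_x.setdefault(x, []).append(y)

    table = {}
    seq = 0
    for x in a_xs:
        for y in ys_by_x.get(x, []):
            # Composite row key (y, seq): rows sort by y, and seq keeps keys unique.
            table[(y, seq)] = x
            seq += 1
    return table


def scan_and_count(table):
    """Stand-in for the map-only second job: scan in row-key order and count per y."""
    ordered_keys = sorted(table)  # an HBase scan returns rows sorted by row key
    return [
        (y, sum(1 for _ in group))
        for y, group in groupby(ordered_keys, key=lambda key: key[0])
    ]


if __name__ == "__main__":
    a_xs = [1, 1, 2, 3]
    b_rows = [(1, "p"), (2, "q"), (2, "p"), (4, "r")]
    print(scan_and_count(join_stage(a_xs, b_rows)))
```

A real map-only job would split the scan across regions, so a group whose rows straddle a region boundary would need extra handling; the sketch ignores that.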