[ https://issues.apache.org/jira/browse/HIVE-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337137#comment-14337137 ]

Sergey Shelukhin commented on HIVE-9277:
----------------------------------------

Also high level note: I still see stuff like " // TODO this info can be more accurate when memory mgmt is available"; how does this patch function without memory management?

> Hybrid Hybrid Grace Hash Join
> -----------------------------
>
>                 Key: HIVE-9277
>                 URL: https://issues.apache.org/jira/browse/HIVE-9277
>             Project: Hive
>          Issue Type: New Feature
>          Components: Physical Optimizer
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>              Labels: join
>         Attachments: HIVE-9277.01.patch, HIVE-9277.02.patch, HIVE-9277.03.patch, High-leveldesignforHybridHybridGraceHashJoinv1.0.pdf
>
>
> We are proposing an enhanced hash join algorithm called _“hybrid hybrid grace hash join”_.
> We can benefit from this feature as illustrated below:
> * The query will not fail even if the estimated memory requirement is slightly wrong
> * Expensive garbage collection overhead can be avoided when the hash table grows
> * Join execution can use a Map join operator even though the small table doesn't fit in memory, as spilling some data from the build and probe sides will still be cheaper than having to shuffle the large fact table
> The design was based on Hadoop’s parallel processing capability and the significant amount of memory available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)