[ https://issues.apache.org/jira/browse/HIVE-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-1158:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
           Status: Resolved  (was: Patch Available)

Committed in 0.5 also. Thanks, Ning.

> Introducing a new parameter for Map-side join bucket size
> ---------------------------------------------------------
>
>                 Key: HIVE-1158
>                 URL: https://issues.apache.org/jira/browse/HIVE-1158
>             Project: Hadoop Hive
>          Issue Type: Improvement
>    Affects Versions: 0.5.0, 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.5.0
>
>         Attachments: HIVE-1158.patch, HIVE-1158_branch_0_5.patch
>
>
> Map-side join caches the small table in memory and joins it with each split of
> the large table on the mapper side. If the small table is too large, it uses
> RowContainer to cache a number of rows indicated by the parameter
> hive.join.cache.size, whose default value is 25000. This parameter is also
> used by regular reducer-side joins to cache all input tables except the
> streaming table. This default value is too large for the map-side join bucket
> size and sometimes results in OOM exceptions. We should define a different
> parameter to separate these two cache sizes.
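
For readers skimming the thread, the idea of separating the two cache sizes can be sketched roughly as follows. This is an illustration only, not the code from the attached patches. The key `hive.join.cache.size` and its default of 25000 come from the description above; the second key name `hive.mapjoin.bucket.cache.size`, its value of 100, and the `BoundedRowCache` class are assumptions made up for this sketch, and the class only stands in for RowContainer by counting the rows that would not fit in memory.

```python
# Sketch only: models choosing a row-cache size per join type.
DEFAULTS = {
    "hive.join.cache.size": 25000,          # default quoted in the issue description
    "hive.mapjoin.bucket.cache.size": 100,  # ASSUMED name and value, for illustration
}


def cache_size(conf, map_side):
    """Return the row-cache size for a map-side or reducer-side join."""
    key = "hive.mapjoin.bucket.cache.size" if map_side else "hive.join.cache.size"
    return int(conf.get(key, DEFAULTS[key]))


class BoundedRowCache:
    """Hypothetical stand-in for RowContainer: holds at most `limit` rows in memory."""

    def __init__(self, limit):
        self.limit = limit
        self.rows = []
        self.spilled = 0  # rows that did not fit in memory

    def add(self, row):
        if len(self.rows) < self.limit:
            self.rows.append(row)
        else:
            self.spilled += 1  # a real container would write these rows to disk


if __name__ == "__main__":
    conf = {}  # no overrides, so the defaults above apply
    print("reducer-side cache size:", cache_size(conf, map_side=False))
    print("map-side cache size:", cache_size(conf, map_side=True))
```

With two keys, the map-side limit can be lowered without shrinking the cache used by reducer-side joins, which is the separation the issue asks for.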