[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563143#comment-16563143 ]
Qiaoyi Ding edited comment on HIVE-1139 at 7/31/18 5:51 AM:
------------------------------------------------------------

Hi all,
I'm new to Hive, and I find that there is no spilling when a hash aggregation has a large number of entries in the hash map, which could cause an OOM. Is this issue still open, or is there any update? [~aprabhakar]

> GroupByOperator sometimes throws OutOfMemory error when there are too many
> distinct keys
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-1139
>                 URL: https://issues.apache.org/jira/browse/HIVE-1139
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.5.0
>            Reporter: Ning Zhang
>            Assignee: Arvind Prabhakar
>            Priority: Major
>         Attachments: PersistentMap.zip
>
>
> When a partial aggregation is performed on a mapper, a HashMap is created to
> keep all distinct keys in main memory. This can lead to an OOM exception when
> there are too many distinct keys for a particular mapper. A workaround is to
> set the map split size smaller so that each mapper processes fewer rows.
> A better solution is to use the persistent HashMapWrapper (currently used in
> CommonJoinOperator) to spill overflow rows to disk.
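
To make the memory concern concrete, here is a minimal sketch of a map-side partial aggregation whose hash map is capped at a fixed number of entries. It is not Hive's GroupByOperator and it does not use HashMapWrapper: the class name BoundedPartialCount, the maxEntries parameter, and the downstream callback are all hypothetical. It also swaps in a simpler technique than the disk spill proposed in the issue: when the map is full, the partial counts are flushed downstream and the map is cleared, on the assumption that a later stage (for example the reducer) merges partial results for the same key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/**
 * Hypothetical bounded partial COUNT(*) per key. Illustration only;
 * this is not Hive code and the names are assumptions.
 */
final class BoundedPartialCount {
    private final int maxEntries;                        // assumed cap on distinct keys held in memory
    private final BiConsumer<String, Long> downstream;   // receives (key, partialCount) pairs
    private final Map<String, Long> partials = new HashMap<>();

    BoundedPartialCount(int maxEntries, BiConsumer<String, Long> downstream) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        this.maxEntries = maxEntries;
        this.downstream = downstream;
    }

    /** Counts one row for the given key, flushing first if a new key would exceed the cap. */
    void add(String key) {
        if (!partials.containsKey(key) && partials.size() >= maxEntries) {
            flush();
        }
        partials.merge(key, 1L, Long::sum);
    }

    /** Emits every partial count downstream and empties the map. Call once more at end of input. */
    void flush() {
        partials.forEach(downstream);
        partials.clear();
    }

    /** Number of distinct keys currently held in memory. */
    int size() {
        return partials.size();
    }
}
```

With this shape the map never holds more than maxEntries keys, at the cost of emitting the same key more than once; the final counts are only correct after the downstream stage sums the partials per key.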