GroupByOperator sometimes throws OutOfMemory error when there are too many
distinct keys
----------------------------------------------------------------------------------------
Key: HIVE-1139
URL: https://issues.apache.org/jira/browse/HIVE-1139
Project: Hadoop Hive
Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
When a partial aggregation is performed on a mapper, a HashMap is created to keep
all distinct keys in main memory. This can lead to an OOM exception when there
are too many distinct keys for a particular mapper. A workaround is to set the
map split size smaller so that each mapper processes fewer rows. A better
solution is to use the persistent HashMapWrapper (currently used in
CommonJoinOperator) to spill overflow rows to disk.
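
To illustrate the spill-to-disk idea, here is a minimal, self-contained sketch.
It is not Hive's HashMapWrapper or GroupByOperator code. The class name
SpillingCounter, the maxInMemoryKeys limit, and the tab-separated spill file
format are all hypothetical and chosen only for illustration. The sketch counts
rows per key, keeps at most a fixed number of distinct keys in memory, and
appends partial counts to a temporary file when that limit is reached.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: COUNT(*) per key with a bounded in-memory hash map. */
class SpillingCounter {
    private final int maxInMemoryKeys;                  // assumed tuning knob
    private final Map<String, Long> counts = new HashMap<>();
    private final Path spillFile;

    SpillingCounter(int maxInMemoryKeys) {
        this.maxInMemoryKeys = maxInMemoryKeys;
        try {
            this.spillFile = Files.createTempFile("groupby-spill", ".tsv");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Aggregates one row; spills first if a new key would exceed the limit. */
    void add(String key) {
        if (!counts.containsKey(key) && counts.size() >= maxInMemoryKeys) {
            spill();
        }
        counts.merge(key, 1L, Long::sum);
    }

    /** Appends the current partial counts to disk and frees the map. */
    private void spill() {
        try (BufferedWriter w = Files.newBufferedWriter(
                spillFile, StandardCharsets.UTF_8, StandardOpenOption.APPEND)) {
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                // Assumes keys contain no line breaks.
                w.write(e.getKey() + "\t" + e.getValue());
                w.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        counts.clear();
    }

    /**
     * Merges spilled partial counts with the in-memory ones.
     * For demonstration only: the merged result is held in memory here,
     * whereas a real operator would stream partial results downstream.
     */
    Map<String, Long> finish() {
        Map<String, Long> merged = new HashMap<>(counts);
        try {
            try (BufferedReader r = Files.newBufferedReader(
                    spillFile, StandardCharsets.UTF_8)) {
                String line;
                while ((line = r.readLine()) != null) {
                    int tab = line.lastIndexOf('\t');
                    merged.merge(line.substring(0, tab),
                            Long.parseLong(line.substring(tab + 1)), Long::sum);
                }
            }
            Files.deleteIfExists(spillFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return merged;
    }
}
```

Because the same key can be spilled more than once, the spill file may hold
several partial counts per key. That is acceptable for partial aggregation,
since a later stage has to combine partial results per key anyway.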
--
This message is automatically generated by JIRA.