Hive runs out of memory with a large number of partitions
---------------------------------------------------------
Key: HIVE-2575
URL: https://issues.apache.org/jira/browse/HIVE-2575
Project: Hive
Issue Type: Bug
Reporter: Jonathan Chang
When a query needs to fetch a large number of partitions (say ~10k),
generating the query plan alone takes several minutes, and the client
often runs out of memory.
A quick investigation shows that the partition pruner is relatively fast,
but the actual fetch of the partitions is quite slow, with most of the time
spent in DataNucleus-generated code. The amount of data that must be pulled
and stored for each Partition object also appears to be quite large.
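
To try this out, something like the following could generate a test case.
It is only a sketch: the table name, column names, and filter are made up
for illustration and are not from the original investigation, and whether
it actually triggers the slowdown or the out-of-memory error will depend on
the Hive version, the metastore backend, and the client heap size. It emits
HiveQL that creates a table with ~10k partitions and then asks for a query
plan that touches all of them.

```python
def repro_statements(table="many_parts", num_partitions=10000):
    """Build HiveQL statements for a table with many partitions.

    The table name and schema are hypothetical; only the partition
    count (~10k) comes from the report.
    """
    stmts = [
        # One data column, one integer partition column.
        f"CREATE TABLE IF NOT EXISTS {table} (x INT) PARTITIONED BY (p INT);"
    ]
    # Register one empty partition per value of p.
    for i in range(num_partitions):
        stmts.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION (p={i});"
        )
    # EXPLAIN only builds the plan, which is the phase reported as slow.
    # The filter matches every partition, so all of them must be fetched.
    stmts.append(f"EXPLAIN SELECT COUNT(*) FROM {table} WHERE p >= 0;")
    return stmts


if __name__ == "__main__":
    print("\n".join(repro_statements()))
```

Saved as, say, gen_repro.py, the output can be redirected to a file and run
with `hive -f`. Timing the final EXPLAIN separately from the ALTER TABLE
statements should show how much of the cost falls in plan generation.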