[jira] [Updated] (HIVE-2082) Reduce memory consumption in preparing MapReduce job

Carl Steinbach (JIRA) Tue, 26 Jul 2011 17:03:35 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Carl Steinbach updated HIVE-2082:
---------------------------------

      Component/s: Query Processor
    Fix Version/s: 0.8.0

> Reduce memory consumption in preparing MapReduce job
> ----------------------------------------------------
>
>                 Key: HIVE-2082
>                 URL: https://issues.apache.org/jira/browse/HIVE-2082
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.8.0
>
>         Attachments: HIVE-2082.patch, HIVE-2082.patch, HIVE-2082.patch
>
>
> The Hive client consumes a lot of memory when the number of input partitions 
> is large. One reason is that each partition maintains its own list of 
> FieldSchema objects, which is intended to support schema evolution. However, 
> these lists are not currently used: Hive applies the table-level schema to 
> all partitions. This will be fixed in HIVE-2050, which reduces the memory 
> consumed by this part by almost half (1.2GB to 700MB for 20k partitions). 
> Another large chunk of memory is consumed in the MapReduce job setup phase, 
> when a PartitionDesc is created from each Partition object. Each 
> PartitionDesc maintains a property object that contains a full list of 
> columns and types. For the same reason, these should be the same as in the 
> table-level schema. In addition, deserializer initialization takes a large 
> amount of memory, which should be avoided. My initial testing of these 
> optimizations cut memory consumption by more than half (700MB to 300MB for 
> 20k partitions). 
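
The idea in the description above (per-partition copies of column metadata
that are identical to the table-level schema) can be illustrated with a small
sketch. This is not the code from the attached patch, and Column,
PartitionInfo and the build methods are hypothetical stand-ins, not Hive's
FieldSchema, PartitionDesc or Partition classes. It only shows why holding
one shared column list instead of one copy per partition removes most of the
duplicated objects:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; NOT Hive's actual classes.
public class SchemaSharingSketch {

    /** One column: a name and a type string, e.g. ("id", "bigint"). */
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** A per-partition descriptor that references a column list. */
    static final class PartitionInfo {
        final String partitionName;
        final List<Column> columns;

        PartitionInfo(String partitionName, List<Column> columns) {
            this.partitionName = partitionName;
            this.columns = columns;
        }
    }

    /** Wasteful variant: every partition gets its own deep copy of the columns. */
    static List<PartitionInfo> buildWithCopies(List<Column> tableColumns, int partitionCount) {
        List<PartitionInfo> result = new ArrayList<PartitionInfo>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            List<Column> copy = new ArrayList<Column>(tableColumns.size());
            for (Column c : tableColumns) {
                copy.add(new Column(c.name, c.type));
            }
            result.add(new PartitionInfo("ds=" + i, copy));
        }
        return result;
    }

    /** Shared variant: every partition references the same read-only list. */
    static List<PartitionInfo> buildShared(List<Column> tableColumns, int partitionCount) {
        List<Column> shared = Collections.unmodifiableList(tableColumns);
        List<PartitionInfo> result = new ArrayList<PartitionInfo>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            result.add(new PartitionInfo("ds=" + i, shared));
        }
        return result;
    }

    /** Counts distinct column-list instances by object identity, not equals(). */
    static int distinctColumnLists(List<PartitionInfo> partitions) {
        Set<List<Column>> seen =
            Collections.newSetFromMap(new IdentityHashMap<List<Column>, Boolean>());
        for (PartitionInfo p : partitions) {
            seen.add(p.columns);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<Column> tableColumns = new ArrayList<Column>();
        tableColumns.add(new Column("id", "bigint"));
        tableColumns.add(new Column("payload", "string"));

        int partitions = 20000; // same order of magnitude as the 20k case above
        System.out.println("column lists with copies: "
            + distinctColumnLists(buildWithCopies(tableColumns, partitions)));
        System.out.println("column lists when shared: "
            + distinctColumnLists(buildShared(tableColumns, partitions)));
    }
}
```

Sharing is only safe while the per-partition schema really is identical to
the table-level one, which is what the description states for current Hive.
The sketch wraps the shared list as read-only so that no partition can
change it for the others.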

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
