[ 
https://issues.apache.org/jira/browse/HIVE-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-3153:
--------------------------------

    Attachment: hive-3153.patch

This patch:
 * Fixes some javadoc
 * Suppresses some unused warnings
 * Deprecates some of the unused public functions that don't seem to be 
important parts of the API.
 * Reduces the memory footprint of the Writer to just the array of 
ColumnBuffers.

With this patch, I'm able to use many more parallel writers in the 
same memory footprint.
                
> Release codecs and output streams between flushes of RCFile
> -----------------------------------------------------------
>
>                 Key: HIVE-3153
>                 URL: https://issues.apache.org/jira/browse/HIVE-3153
>             Project: Hive
>          Issue Type: Improvement
>          Components: Compression
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: hive-3153.patch
>
>
> Currently, the RCFile writer holds one compression codec per file and one 
> compression output stream per column. For queries that use dynamic 
> partitions in particular, this quickly consumes a lot of memory.
> I'd like flushRecords to get a codec from the pool and create the compression 
> output stream in flushRecords.
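
The idea in the description can be sketched roughly as follows. This is a minimal illustration and not the code in hive-3153.patch: the class PooledFlushSketch, its small static pool, and the use of java.util.zip.Deflater are all assumptions made to keep the example self-contained. In Hadoop, the pool's role would presumably be played by CodecPool (getCompressor / returnCompressor). Between flushes the writer holds only raw per-column buffers; the compressor and the compression streams exist only inside flushRecords.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Hypothetical sketch; not the RCFile.Writer code from the patch. */
final class PooledFlushSketch {
    // Stand-in for a shared compressor pool (assumption, for illustration only).
    private static final ArrayDeque<Deflater> POOL = new ArrayDeque<>();

    static synchronized Deflater borrow() {
        Deflater d = POOL.poll();
        return d != null ? d : new Deflater();
    }

    static synchronized void giveBack(Deflater d) {
        d.reset();
        POOL.push(d);
    }

    static synchronized int pooledCount() {
        return POOL.size();
    }

    // The only state kept between flushes: one uncompressed buffer per column.
    private final ByteArrayOutputStream[] columns;

    PooledFlushSketch(int numColumns) {
        columns = new ByteArrayOutputStream[numColumns];
        for (int i = 0; i < numColumns; i++) {
            columns[i] = new ByteArrayOutputStream();
        }
    }

    void append(int column, byte[] value) {
        columns[column].write(value, 0, value.length);
    }

    /** Borrows a compressor, compresses every column, then returns it to the pool. */
    byte[][] flushRecords() throws IOException {
        byte[][] out = new byte[columns.length][];
        Deflater deflater = borrow();
        try {
            for (int i = 0; i < columns.length; i++) {
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                // The compression stream lives only for the duration of this column's flush.
                try (DeflaterOutputStream zip = new DeflaterOutputStream(sink, deflater)) {
                    columns[i].writeTo(zip);
                }
                out[i] = sink.toByteArray();
                columns[i].reset();
                deflater.reset(); // make the compressor reusable for the next column
            }
        } finally {
            giveBack(deflater);
        }
        return out;
    }
}
```

With this shape, many idle writers cost only their column buffers, and one pooled compressor can serve whichever writer happens to be flushing.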

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
