[ 
https://issues.apache.org/jira/browse/HIVE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206967#comment-13206967
 ] 

Krishna Kumar commented on HIVE-2623:
-------------------------------------

I will attach CPU usage information for compression/decompression.

 - Note that this compression/decompression is computationally trivial compared 
to the work done by bzip2 and similar codecs. Each number is converted to bits 
one item at a time, with no lookback, sorting, and so on.
 - The numbers above are specific to one particular distribution, for which the 
encoding is optimal. I have a few other integer compressors coded (Elias delta, 
Elias omega, VInt, continuation bit, Golomb/Rice), each of which will have its 
own sweet spot. (The last is parametric, with the parameter estimated on the 
block, which means that it is optimal for several distributions.) I still need 
to do the analysis and get the numbers for these too.
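
For readers unfamiliar with these codes, here is a minimal sketch of what 
"one item at a time, with no lookback" can look like. It is not taken from the 
HIVE-2623 patches: the function names are hypothetical, bits are kept in a 
Python string for readability rather than packed, the Rice unary prefix uses 
the "ones terminated by a zero" convention (others exist), and the parameter 
estimate from the block mean is only one simple heuristic.

```python
from typing import List


def elias_gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for integers >= 1")
    binary = bin(n)[2:]  # binary digits; the leading bit is always 1
    # Prefix of (length - 1) zeros, followed by the binary digits themselves.
    return "0" * (len(binary) - 1) + binary


def elias_gamma_decode(bits: str) -> List[int]:
    """Decode a concatenation of Elias gamma codes back into integers."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the zero prefix
            zeros += 1
            pos += 1
        # The next (zeros + 1) bits are the value in plain binary.
        values.append(int(bits[pos:pos + zeros + 1], 2))
        pos += zeros + 1
    return values


def rice_encode(n: int, k: int) -> str:
    """Rice code (Golomb code with divisor 2**k) of a non-negative integer."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient, remainder = n >> k, n & ((1 << k) - 1)
    suffix = format(remainder, "b").zfill(k) if k else ""
    # Quotient in unary (ones, then a terminating zero), remainder in k bits.
    return "1" * quotient + "0" + suffix


def estimate_rice_k(block: List[int]) -> int:
    """Assumed heuristic: choose k so that 2**k is near the block mean."""
    mean = sum(block) / len(block)
    return max(0, int(mean).bit_length() - 1)


if __name__ == "__main__":
    for value in (1, 2, 5, 9, 100):
        print(value, elias_gamma_encode(value))
```

The gamma code for n takes 2*floor(log2(n)) + 1 bits, so small values are 
cheap and large values cost roughly twice their binary length, which is why 
the encoding suits a distribution skewed towards small integers. The Rice 
coder shifts its sweet spot with k, which is the sense in which a parametric 
code fits more than one distribution.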
                
> Add Integer type compressors
> ----------------------------
>
>                 Key: HIVE-2623
>                 URL: https://issues.apache.org/jira/browse/HIVE-2623
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Contrib
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: HIVE-2623.v0.patch, HIVE-2623.v1.patch, 
> HIVE-2623.v2.patch, data.tar.gz
>
>
> Type-specific compressors for integers.
> Starting with Elias gamma, which favors small values, as in a power-law-like 
> distribution. 
