[ https://issues.apache.org/jira/browse/HIVE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206967#comment-13206967 ]
Krishna Kumar commented on HIVE-2623:
-------------------------------------

I will attach CPU usage information for compression/decompression.

- Note that this compression/decompression is trivial compared to the work done by bzip2 and similar compressors. The numbers are converted to bits one item at a time, with no lookback, sorting, etc.
- The numbers above are specific to a particular distribution, for which the encoding is optimal. I have a few other integer compressors coded (Elias delta, Elias omega, Vint, continuation bit, Golomb/Rice), each of which will have its own sweet spot. (The last is parametric, with the parameter estimated on the block, which means that it will have several optimal distributions.) I need to do the analysis and get the numbers for these too.

> Add Integer type compressors
> ----------------------------
>
>                 Key: HIVE-2623
>                 URL: https://issues.apache.org/jira/browse/HIVE-2623
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Contrib
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: HIVE-2623.v0.patch, HIVE-2623.v1.patch, HIVE-2623.v2.patch, data.tar.gz
>
>
> Type-specific compressors for integers.
> Starting with Elias gamma, which prefers small values, as in a power-law-like distribution.
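
For readers unfamiliar with the Elias gamma coding named in the issue description, here is a minimal sketch in Python. It is not taken from the attached patches: the function names are hypothetical, and a string of '0'/'1' characters stands in for a real bit stream purely for readability. It illustrates the "one item at a time, no lookback" property mentioned in the comment.

```python
def elias_gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined only for integers >= 1")
    binary = bin(n)[2:]  # binary form of n; always starts with '1'
    # Prefix: one '0' per bit after the leading '1', then the binary form itself.
    return "0" * (len(binary) - 1) + binary


def encode_block(values) -> str:
    """Encode each value independently and concatenate (no lookback or sorting)."""
    return "".join(elias_gamma_encode(v) for v in values)


def elias_gamma_decode(bits: str) -> list:
    """Decode a concatenation of Elias gamma codes back into integers."""
    values = []
    i = 0
    while i < len(bits):
        # Count the leading zeros; they give the number of bits after the leading '1'.
        zeros = 0
        while i < len(bits) and bits[i] == "0":
            zeros += 1
            i += 1
        end = i + zeros + 1
        if end > len(bits):
            raise ValueError("truncated Elias gamma stream")
        values.append(int(bits[i:end], 2))
        i = end
    return values
```

A value n takes 2*floor(log2(n)) + 1 bits under this scheme, so 1 costs a single bit while 1000 costs 19 bits, which is why the code favors distributions dominated by small values. The sketch accepts only integers >= 1; how zero or negative values would be mapped into that range is left open here.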