[ https://issues.apache.org/jira/browse/HIVE-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160734#comment-13160734 ]
alex gemini commented on HIVE-2097:
-----------------------------------

I'm not quite sure I explained the columnar database execution strategy clearly. I hope the following material will help:
#1 http://www.infoq.com/news/2011/09/nosqlnow-columnar-databases
#2 http://www.oscon.com/oscon2010/public/schedule/detail/15561
#3 http://www.vertica.com/2010/05/26/why-verticas-compression-is-better/
#4 http://www.vertica.com/2011/09/01/the-power-of-projections-part-1/
Good luck.

> Explore mechanisms for better compression with RC Files
> -------------------------------------------------------
>
>                 Key: HIVE-2097
>                 URL: https://issues.apache.org/jira/browse/HIVE-2097
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>
> Optimization of the compression mechanisms used by RC File to be explored.
> Some initial ideas:
>
> 1. More efficient serialization/deserialization based on type-specific and
>    storage-specific knowledge.
>    For instance, storing sorted numeric values efficiently using some delta
>    coding techniques.
> 2. More efficient compression based on type-specific and storage-specific
>    knowledge.
>    Enable compression codecs to be specified based on types or individual
>    columns.
> 3. Reordering the on-disk storage for better compression efficiency.
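
As a rough illustration of idea 1 in the quoted description (delta coding of sorted numeric values), here is a minimal Python sketch. It is not RCFile or Hive code: the function names delta_encode and delta_decode and the sample column are hypothetical, chosen only to show the technique.

```python
from typing import List


def delta_encode(sorted_values: List[int]) -> List[int]:
    """Keep the first value, then store each value as the gap from its predecessor."""
    if not sorted_values:
        return []
    deltas = [sorted_values[0]]
    for prev, cur in zip(sorted_values, sorted_values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original values with a running sum over the deltas."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


# Hypothetical sorted column; the values are illustrative only.
column = [1000003, 1000007, 1000012, 1000012, 1000020]
encoded = delta_encode(column)

# The round trip must reproduce the original column exactly.
assert delta_decode(encoded) == column
```

After the first entry, the encoded column holds only small gaps rather than seven-digit values, so a variable-length integer encoding or a general-purpose codec applied afterwards would typically have less to store. How much this helps in practice depends on the data and on the codec.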