[ https://issues.apache.org/jira/browse/HBASE-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809496#comment-13809496 ]
Andrew Purtell commented on HBASE-9045:
---------------------------------------

bq. do we always need to compress tag by tag, or can the entire tag part sometimes be compressed? In some cases compressing the entire thing would be simpler and, I feel, better for that scenario. That would then impact the WAL compression as well.

Interesting point. For the case where there's just one tag on the cell, the two approaches are equivalent, and for cases where a number of cells carry the exact same set of tags, compressing the whole tag part would perform better. On the other hand, if cells have many tags in common but the same combination rarely repeats from one cell to the next, a dictionary of whole tag parts will be inefficient compared to the per-tag approach. The per-tag approach is probably better for the general case.

> Support Dictionary based Tag compression in HFiles
> --------------------------------------------------
>
>                 Key: HBASE-9045
>                 URL: https://issues.apache.org/jira/browse/HBASE-9045
>             Project: HBase
>          Issue Type: Sub-task
>   Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 0.98.0
>
>         Attachments: HBASE-9045.patch, HBASE-9045_V2.patch
>
>
> Along with the DataBlockEncoding algorithms, Dictionary based Tag compression
> can be done

--
This message was sent by Atlassian JIRA
(v6.1#6144)
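
To make the trade-off in the comment above concrete, here is a rough sketch that compares the two dictionary granularities on three toy workloads. It is not the code from the attached patches: the class `TagDictionarySketch`, the tag strings, and the cost model (1 marker byte plus the raw bytes on a dictionary miss, 2 bytes for an index on a hit) are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not HBase code: estimates encoded size under an assumed cost model.
public class TagDictionarySketch {
    static final int INDEX_COST = 2;    // assumed bytes written on a dictionary hit
    static final int MISS_OVERHEAD = 1; // assumed marker byte written on a miss

    // Three illustrative tags, each 10 ASCII characters long.
    static final String A = "vis:secret";
    static final String B = "acl:admins";
    static final String C = "ttl:864000";

    // Cost of writing one entry; the entry is added to the dictionary on a miss.
    static int cost(String entry, Set<String> dictionary) {
        if (dictionary.add(entry)) {
            return MISS_OVERHEAD + entry.length(); // ASCII only, so length == bytes
        }
        return INDEX_COST;
    }

    // Per-tag approach: every tag is looked up in the dictionary on its own.
    static int perTagSize(List<List<String>> cells) {
        Set<String> dictionary = new HashSet<>();
        int size = 0;
        for (List<String> tags : cells) {
            for (String tag : tags) {
                size += cost(tag, dictionary);
            }
        }
        return size;
    }

    // Whole-block approach: the cell's entire tag part is a single dictionary entry.
    static int wholeBlockSize(List<List<String>> cells) {
        Set<String> dictionary = new HashSet<>();
        int size = 0;
        for (List<String> tags : cells) {
            size += cost(String.join("", tags), dictionary);
        }
        return size;
    }

    // Every cell carries a single tag.
    static List<List<String>> singleTagCells() {
        return Arrays.asList(Arrays.asList(A), Arrays.asList(A),
                             Arrays.asList(A), Arrays.asList(A));
    }

    // Every cell carries the exact same set of tags.
    static List<List<String>> identicalCells() {
        return Arrays.asList(Arrays.asList(A, B, C), Arrays.asList(A, B, C),
                             Arrays.asList(A, B, C), Arrays.asList(A, B, C));
    }

    // Tags are shared across cells, but no combination repeats.
    static List<List<String>> mixedCells() {
        return Arrays.asList(Arrays.asList(A, B), Arrays.asList(B, C),
                             Arrays.asList(A, C), Arrays.asList(A, B, C));
    }

    static void report(String name, List<List<String>> cells) {
        System.out.println(name + ": per-tag=" + perTagSize(cells)
                + " whole-block=" + wholeBlockSize(cells));
    }

    public static void main(String[] args) {
        report("single tag", singleTagCells());
        report("identical sets", identicalCells());
        report("mixed sets", mixedCells());
    }
}
```

Under these assumed costs the single-tag workload comes out the same for both approaches (17 bytes), the identical-set workload favors the whole-block dictionary (37 vs. 51 bytes), and the mixed workload strongly favors the per-tag dictionary (45 vs. 94 bytes). The absolute numbers depend entirely on the assumed cost model; only the direction of each comparison is the point.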