[ 
https://issues.apache.org/jira/browse/HBASE-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809496#comment-13809496
 ] 

Andrew Purtell commented on HBASE-9045:
---------------------------------------

bq. Do we always need to compress tag by tag, or can the entire tag part 
sometimes be compressed? In some cases compressing the entire thing would be 
simple and would be better for that scenario, I feel. That would impact the WAL 
compression too.

Interesting point. For the case where there's just one tag on the cell, the 
two approaches are equivalent, and for cases where a number of cells carry the 
exact same set of tags, compressing the whole tag part would perform better. On 
the other hand, if cells have many tags in common but the full set of tags 
rarely matches from one cell to the next, a dictionary of whole tag parts will 
be inefficient compared to the per-tag approach. The per-tag approach is 
probably better for the general case.
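
To make the trade-off concrete, here is a rough sketch of the two dictionary 
granularities under a toy cost model. This is not the code in the attached 
patches. The 2-byte index cost, the 1-byte miss marker, the grow-only 
dictionary, and the tag values are all assumptions chosen for illustration.

```python
INDEX_BYTES = 2    # assumed cost of writing a dictionary reference
MISS_OVERHEAD = 1  # assumed marker byte written before a literal entry


def encoded_size(entries):
    """Bytes needed to write `entries` using a grow-only dictionary."""
    seen = set()
    total = 0
    for entry in entries:
        if entry in seen:
            total += INDEX_BYTES                 # hit: write the index only
        else:
            seen.add(entry)
            total += MISS_OVERHEAD + len(entry)  # miss: write the literal
    return total


def per_tag_size(cells):
    # Every tag is its own dictionary entry.
    return encoded_size(tag for cell in cells for tag in cell)


def whole_part_size(cells):
    # The concatenated tag part of a cell is a single dictionary entry.
    return encoded_size(b"".join(cell) for cell in cells)


# Hypothetical tags, not taken from any real table.
A, B, C = b"acl:alice", b"vis:secret", b"ttl:86400"

scenarios = {
    "one tag per cell": [(A,), (A,), (A,)],
    "identical tag sets": [(A, B), (A, B), (A, B)],
    "overlapping but different sets": [(A, B), (A, C), (B, C)],
}

for name, cells in scenarios.items():
    print(name, per_tag_size(cells), whole_part_size(cells))
```

Under these assumptions the two approaches tie in the first scenario, the 
whole-tag-part dictionary wins in the second, and the per-tag dictionary wins 
in the third, because no complete tag part ever repeats there.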

> Support Dictionary based Tag compression in HFiles
> --------------------------------------------------
>
>                 Key: HBASE-9045
>                 URL: https://issues.apache.org/jira/browse/HBASE-9045
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 0.98.0
>
>         Attachments: HBASE-9045.patch, HBASE-9045_V2.patch
>
>
> Along with the DataBlockEncoding algorithms, Dictionary based Tag compression 
> can be done



--
This message was sent by Atlassian JIRA
(v6.1#6144)
