Wellington Chevreuil created HBASE-29412:
--------------------------------------------

             Summary: Extend date tiered compaction to allow for tiering by 
values other than cell timestamp
                 Key: HBASE-29412
                 URL: https://issues.apache.org/jira/browse/HBASE-29412
             Project: HBase
          Issue Type: Task
          Components: Compaction
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


Extend DateTieredCompactor with a CustomTieredCompactor that uses a pluggable 
value provider for extracting the value to be used for comparison in this 
tiered compaction.
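
A minimal sketch of what a pluggable value provider could look like, assuming 
a simplified cell model. SimpleCell, TieringValueProvider, BY_TIMESTAMP, 
BY_VALUE and tierFor are hypothetical names used for illustration only; they 
are not existing HBase APIs.

```java
public class TieringProviderSketch {

  /** Hypothetical, simplified stand-in for an HBase cell. */
  static final class SimpleCell {
    final String qualifier;
    final long timestamp;
    final long value;

    SimpleCell(String qualifier, long timestamp, long value) {
      this.qualifier = qualifier;
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  /** Hypothetical plug-in point: maps a cell to the value used for tiering. */
  interface TieringValueProvider {
    long tieringValue(SimpleCell cell);
  }

  /** Tiers by cell timestamp, as date tiered compaction does today. */
  static final TieringValueProvider BY_TIMESTAMP = cell -> cell.timestamp;

  /** Tiers by the numeric payload of the cell instead of its timestamp. */
  static final TieringValueProvider BY_VALUE = cell -> cell.value;

  /**
   * Returns the index of the last tier whose lower boundary is not greater
   * than the value. Boundaries are assumed to be sorted in ascending order.
   */
  static int tierFor(long value, long[] lowerBoundaries) {
    int tier = 0;
    for (int i = 0; i < lowerBoundaries.length; i++) {
      if (value >= lowerBoundaries[i]) {
        tier = i;
      }
    }
    return tier;
  }

  public static void main(String[] args) {
    SimpleCell cell = new SimpleCell("q", 100L, 2500L);
    long[] boundaries = {Long.MIN_VALUE, 1000L, 2000L};
    System.out.println("timestamp tier="
        + tierFor(BY_TIMESTAMP.tieringValue(cell), boundaries)
        + ", value tier=" + tierFor(BY_VALUE.tieringValue(cell), boundaries));
  }
}
```

With this shape, the compactor itself only depends on the provider interface, 
so swapping the tiering criterion would not require touching the tier 
selection logic.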

Define a built-in value provider that uses a configurable qualifier value for 
comparison in the tiered compaction. Using a qualifier value for grouping data 
may require propagating this value to all other cells within the same row key:
* We can use cell tags to attach the tiering value to every other cell in the 
row. The multi file writer may need this, because cells are appended one at a 
time and the writer must know which tier file each appended cell belongs to.
* Finding the tiering value for each row requires scanning back and forth over 
the cells of that row, because the tiering value must be known before any of 
the row's cells are written to the new files.
* If a given row doesn't have the tiering value, just treat it as high priority 
and tag its cells to be tiered within the highest priority group.
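
The points above can be sketched as a per-row buffering pass. This is only an 
illustration under stated assumptions: SimpleCell, tieringTag, 
MISSING_VALUE_TIER and tagRows are hypothetical names, the tag is modelled as a 
plain field rather than a real cell tag, and using Long.MAX_VALUE for rows 
without a tiering value is an assumption about what "highest priority" means.

```java
import java.util.ArrayList;
import java.util.List;

public class RowTieringSketch {

  /** Assumed marker for rows without a tiering value (highest priority). */
  static final long MISSING_VALUE_TIER = Long.MAX_VALUE;

  /** Hypothetical, simplified cell; tieringTag stands in for a cell tag. */
  static final class SimpleCell {
    final String row;
    final String qualifier;
    final long value;
    long tieringTag = -1L;

    SimpleCell(String row, String qualifier, long value) {
      this.row = row;
      this.qualifier = qualifier;
      this.value = value;
    }
  }

  /** Looks through one buffered row for the configured tiering qualifier. */
  static long findTieringValue(List<SimpleCell> rowCells, String tieringQualifier) {
    for (SimpleCell cell : rowCells) {
      if (cell.qualifier.equals(tieringQualifier)) {
        return cell.value;
      }
    }
    return MISSING_VALUE_TIER;
  }

  /**
   * Buffers cells row by row (input is assumed sorted by row), resolves the
   * tiering value once the whole row is seen, then tags and emits its cells.
   */
  static List<SimpleCell> tagRows(List<SimpleCell> sortedCells, String tieringQualifier) {
    List<SimpleCell> out = new ArrayList<>();
    List<SimpleCell> buffer = new ArrayList<>();
    for (SimpleCell cell : sortedCells) {
      if (!buffer.isEmpty() && !buffer.get(0).row.equals(cell.row)) {
        flush(buffer, tieringQualifier, out);
      }
      buffer.add(cell);
    }
    flush(buffer, tieringQualifier, out);
    return out;
  }

  /** Tags every buffered cell with the row's tiering value and emits it. */
  private static void flush(List<SimpleCell> buffer, String qualifier, List<SimpleCell> out) {
    if (buffer.isEmpty()) {
      return;
    }
    long tieringValue = findTieringValue(buffer, qualifier);
    for (SimpleCell cell : buffer) {
      cell.tieringTag = tieringValue;
      out.add(cell);
    }
    buffer.clear();
  }
}
```

Because every emitted cell carries the tag, a downstream writer that receives 
cells one at a time could route each cell by reading the tag alone, without 
seeing the rest of the row.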

The tiering boundary values (min/max) of each tiering group should be persisted 
to the related store file as a KV pair in the file info portion of the file.
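
As a rough illustration of persisting the boundaries, the sketch below tracks 
the min/max tiering value seen by one writer and encodes the pair as a single 
byte array value. The RangeTracker class, the TIERING_VALUE_RANGE key and the 
16-byte encoding are assumptions made for this example, and a plain Map stands 
in for the store file's file info section.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class TierBoundarySketch {

  /** Hypothetical file info key; not an existing HBase constant. */
  static final String RANGE_KEY = "TIERING_VALUE_RANGE";

  /** Tracks the smallest and largest tiering value written to one file. */
  static final class RangeTracker {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;

    void include(long tieringValue) {
      min = Math.min(min, tieringValue);
      max = Math.max(max, tieringValue);
    }

    /** Encodes min followed by max as two big-endian longs. */
    byte[] toBytes() {
      return ByteBuffer.allocate(2 * Long.BYTES).putLong(min).putLong(max).array();
    }

    /** Decodes a range previously produced by toBytes(). */
    static RangeTracker fromBytes(byte[] bytes) {
      ByteBuffer buffer = ByteBuffer.wrap(bytes);
      RangeTracker tracker = new RangeTracker();
      tracker.min = buffer.getLong();
      tracker.max = buffer.getLong();
      return tracker;
    }
  }

  public static void main(String[] args) {
    RangeTracker tracker = new RangeTracker();
    for (long value : new long[] {30L, 10L, 20L}) {
      tracker.include(value);
    }
    // Stand-in for the file info key/value section of a store file.
    Map<String, byte[]> fileInfo = new HashMap<>();
    fileInfo.put(RANGE_KEY, tracker.toBytes());

    RangeTracker restored = RangeTracker.fromBytes(fileInfo.get(RANGE_KEY));
    System.out.println("min=" + restored.min + " max=" + restored.max);
  }
}
```

Reading the range back from file info would let a later compaction decide 
which tier a file belongs to without scanning its cells.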

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
