More efficient age-off of old data during major compaction
----------------------------------------------------------

                 Key: HBASE-4717
                 URL: https://issues.apache.org/jira/browse/HBASE-4717
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
    Affects Versions: 0.94.0
            Reporter: Todd Lipcon


Many applications need efficient age-off of old data. Currently, age-off 
happens only during major compaction, which scans through every KV in the 
store. Instead, we could implement the following:
- Set hbase.hstore.compaction.max.size reasonably small, so that older store 
files each cover only a small, bounded range of time.
- Periodically run an "age-off compaction". This compaction would scan the 
current list of store files. Any store file that falls entirely outside the 
TTL time range would be dropped. Store files completely within the time range 
would be left unaltered. Those crossing the time-range boundary could either 
be left alone or compacted using the existing compaction code.
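
To make the three cases concrete, here is a rough sketch of the per-file 
decision. It is not HBase code: AgeOffSketch, FileTimeRange, Action, classify 
and filesToDrop are hypothetical names, and it assumes each store file's 
minimum and maximum cell timestamps are available without reading the KVs, 
and that a cell counts as expired when its timestamp is older than now minus 
the TTL.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: none of these types exist in HBase.
class AgeOffSketch {

    // The three outcomes described in the proposal.
    enum Action { DROP, KEEP, STRADDLES }

    // Hypothetical stand-in for a store file's metadata:
    // its name and the min/max cell timestamps it contains (ms).
    static final class FileTimeRange {
        final String name;
        final long minTs;
        final long maxTs;

        FileTimeRange(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    // Decide what an age-off pass would do with one file.
    static Action classify(FileTimeRange f, long nowMs, long ttlMs) {
        // Assumption: cells with timestamp < cutoff are expired.
        long cutoff = nowMs - ttlMs;
        if (f.maxTs < cutoff) {
            return Action.DROP;      // newest cell is already expired
        }
        if (f.minTs >= cutoff) {
            return Action.KEEP;      // oldest cell is still live
        }
        return Action.STRADDLES;     // mixed: leave alone or compact normally
    }

    // Collect the names of files that could be deleted without scanning them.
    static List<String> filesToDrop(List<FileTimeRange> files,
                                    long nowMs, long ttlMs) {
        List<String> drop = new ArrayList<>();
        for (FileTimeRange f : files) {
            if (classify(f, nowMs, ttlMs) == Action.DROP) {
                drop.add(f.name);
            }
        }
        return drop;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long ttl = 24L * 60 * 60 * 1000; // one day, chosen arbitrarily
        List<FileTimeRange> files = new ArrayList<>();
        files.add(new FileTimeRange("old", now - 3 * ttl, now - 2 * ttl));
        files.add(new FileTimeRange("mixed", now - 2 * ttl, now - ttl / 2));
        files.add(new FileTimeRange("fresh", now - ttl / 2, now));
        for (FileTimeRange f : files) {
            System.out.println(f.name + " -> " + classify(f, now, ttl));
        }
    }
}
```

The point of the sketch is that the DROP and KEEP decisions need only 
per-file metadata, so only files straddling the cutoff would ever need a 
KV-level scan.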

I don't have a design in mind for how exactly this would be implemented, but 
hope to generate some discussion.


        
