[ https://issues.apache.org/jira/browse/HBASE-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-3745.
----------------------------------

    Resolution: Duplicate

Let me mark this as a duplicate of HBASE-6371.

> Add the ability to restrict major-compactible files by timestamp
> ----------------------------------------------------------------
>
>                 Key: HBASE-3745
>                 URL: https://issues.apache.org/jira/browse/HBASE-3745
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>
> In some applications, a common access pattern is to frequently scan tables
> with a time range predicate restricted to a fairly recent time window. For
> example, you may want to do an incremental aggregation or indexing step only
> on rows that have changed in the last hour. We do this efficiently by
> tracking min and max timestamp on an HFile level, so that old HFiles don't
> have to be read.
> After a major compaction, however, the entire dataset will need to be read,
> which can hurt performance of this access pattern.
> We should add a column family attribute that can specify a policy like: When
> major compacting, never include an HFile that contains data with a timestamp
> in the last 4 hours. This way, recently flushed HFiles will always remain
> uncompacted and provide the good scan performance required for these
> applications.
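
For illustration only, the policy proposed in the description could be sketched roughly as below. This is not HBase code: the class RecentFileExclusion, the FileInfo type, and the method selectForMajorCompaction are hypothetical names invented for this sketch, and it assumes each file exposes the min/max timestamps that the description says are tracked per HFile.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the HBase code base.
final class RecentFileExclusion {

    /** Stand-in for an HFile together with its tracked timestamp range. */
    static final class FileInfo {
        final String name;
        final long minTimestamp; // oldest cell timestamp in the file (ms)
        final long maxTimestamp; // newest cell timestamp in the file (ms)

        FileInfo(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the files eligible for a major compaction: those whose newest
     * cell is older than the exclusion window. A file holding any data from
     * within the window is skipped and therefore stays uncompacted.
     */
    static List<FileInfo> selectForMajorCompaction(
            List<FileInfo> files, long nowMillis, long exclusionWindowMillis) {
        long cutoff = nowMillis - exclusionWindowMillis;
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp < cutoff) {
                selected.add(f);
            }
        }
        return selected;
    }
}
```

With a 4-hour window, a file whose newest cell is 10 hours old would be selected, while a file flushed an hour ago would be left alone, so a scan restricted to the last hour could still skip the large compacted file based on its timestamp range.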