[ 
https://issues.apache.org/jira/browse/HBASE-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell closed HBASE-3745.
--------------------------------------

> Add the ability to restrict major-compactible files by timestamp
> ----------------------------------------------------------------
>
>                 Key: HBASE-3745
>                 URL: https://issues.apache.org/jira/browse/HBASE-3745
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> In some applications, a common access pattern is to frequently scan tables 
> with a time range predicate restricted to a fairly recent time window. For 
> example, you may want to do an incremental aggregation or indexing step only 
> on rows that have changed in the last hour. We do this efficiently by 
> tracking min and max timestamp on an HFile level, so that old HFiles don't 
> have to be read.
> After a major compaction, however, every HFile in a store is merged into one 
> file whose timestamp range spans the whole dataset, so the entire dataset 
> has to be read, which can hurt performance of this access pattern.
> We should add a column family attribute that can specify a policy like: When 
> major compacting, never include an HFile that contains data with a timestamp 
> in the last 4 hours. This way, recently flushed HFiles will always be 
> uncompacted and provide the good scan performance required for these 
> applications.
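
The two ideas in the description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not HBase code: FileInfo, overlaps, and selectForMajorCompaction are hypothetical names, timestamps are epoch milliseconds, and the scan range is treated as inclusive on both ends. The first method shows why a time-range scan can skip an old file; the second shows one way the proposed policy could filter major-compaction candidates.

```java
import java.util.ArrayList;
import java.util.List;

public class TimestampRestrictedSelection {

    /** Hypothetical stand-in for the min/max timestamps tracked per HFile. */
    public static final class FileInfo {
        public final String name;
        public final long minTimestamp;
        public final long maxTimestamp;

        public FileInfo(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * A scan restricted to [scanMin, scanMax] only needs to read a file whose
     * timestamp range overlaps that window (inclusive bounds are an assumption).
     */
    public static boolean overlaps(FileInfo f, long scanMin, long scanMax) {
        return f.maxTimestamp >= scanMin && f.minTimestamp <= scanMax;
    }

    /**
     * Keeps only files whose newest cell is older than (now - protectMillis).
     * Any file containing data from the protected window is left out of the
     * major compaction, so it stays small and cheap to scan.
     */
    public static List<FileInfo> selectForMajorCompaction(
            List<FileInfo> candidates, long now, long protectMillis) {
        long cutoff = now - protectMillis;
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo f : candidates) {
            if (f.maxTimestamp < cutoff) {
                selected.add(f);
            }
        }
        return selected;
    }
}
```

With a 4 hour protected window, a file flushed 30 minutes ago is never selected, so a scan over the last hour still touches only that file rather than the compacted output.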



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
