[
https://issues.apache.org/jira/browse/HBASE-28463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wellington Chevreuil resolved HBASE-28463.
------------------------------------------
Resolution: Fixed
This feature is now merged into master, branch-3, branch-2 and branch-2.6.
Thanks for the contributions, [~janardhan.hungund] and [~vinayakhegde], and for
the reviews, [~taklwu]!
> Time Based Priority for BucketCache
> -----------------------------------
>
> Key: HBASE-28463
> URL: https://issues.apache.org/jira/browse/HBASE-28463
> Project: HBase
> Issue Type: New Feature
> Components: BucketCache
> Reporter: Janardhan Hungund
> Assignee: Wellington Chevreuil
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.4
>
>
> This Jira introduces time-based priority in BucketCache, where a configurable
> "age" is used as a threshold for data caching. Data blocks more recent than
> this limit should be kept in the cache, while older data would be picked for
> eviction (or not considered for caching). The data-age-based priority is
> applied when deciding whether a block should be added to BucketCache (i.e.
> during reads, writes, compaction and prefetch), as well as during the cache
> freeSpace run (mass eviction), before applying the LRU logic.
> Because blocks don't hold any specific meta information other than type, it's
> necessary to group blocks of the same "age group" in separate files. We
> already have DateTieredCompaction for that, which groups blocks into
> different time windows according to their cells' timestamp values.
> DateTieredCompaction can be configured to provide two windows (one older and
> one younger than the threshold limit), so that a cell-timestamp-based age
> priority can be implemented. Additionally, we have extended
> DateTieredCompaction so that the "age" value used for comparison can be
> provided in a pluggable way, giving different use cases the flexibility to
> implement their own concept of time priority.
> The current scope is to allow for data age to be determined in the following
> different ways, all configurable:
> * Cell timestamps: Uses the timestamp portion of HBase cells for comparing
> the data age. Requires DateTieredCompaction to be configured to provide two
> time windows, one older and one younger than the time limit threshold.
> * Custom cell qualifiers: Uses a custom-defined qualifier for comparing the
> data age. The qualifier's value is used to tier the entire row containing
> it. This requires that the value be a valid Java long timestamp, and that
> the "new" compaction implementation defined as part of this feature, the
> CustomTieredCompaction, be used.
> * Custom value provider: Allows for defining a pluggable implementation that
> contains the logic for identifying the date value to be used for comparison.
> This also requires the "new" compaction implementation defined as part of
> this feature, the CustomTieredCompaction.
> The initial scope proposed in 2024 covered the cell timestamp strategy
> mentioned above and is detailed in this [design
> doc.|https://docs.google.com/document/d/1Qd3kvZodBDxHTFCIRtoePgMbvyuUSxeydi2SEWQFQro/edit?tab=t.0#heading=h.gjdgxs]
> The second phase, including the two custom strategies mentioned above, is
> detailed in [this separate design
> doc.|https://docs.google.com/document/d/1uBGIO9IQ-FbSrE5dnUMRtQS23NbCbAmRVDkAOADcU_E/edit?usp=sharing]
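
For readers skimming the description above, here is a minimal Java sketch of the age-threshold idea. It is not taken from the HBase code base: the class name, the method names, the 8-byte big-endian encoding of the qualifier value, and the choice to treat data exactly at the limit as "cold" are all assumptions made for illustration. The ToLongFunction parameter stands in for the pluggable "age" provider.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.TimeUnit;
import java.util.function.ToLongFunction;

// Hypothetical illustration only; none of these names exist in HBase.
final class TimeBasedPrioritySketch {

  /** True when the data is younger than the configured age limit (assumed strict). */
  static boolean isHot(long dataTimestampMs, long nowMs, long hotAgeLimitMs) {
    return nowMs - dataTimestampMs < hotAgeLimitMs;
  }

  /**
   * Decodes a timestamp from a custom qualifier value. The 8-byte big-endian
   * layout is an assumption; the issue only says the value must be a valid
   * Java long timestamp.
   */
  static long decodeTimestamp(byte[] value) {
    if (value == null || value.length != Long.BYTES) {
      throw new IllegalArgumentException("expected an 8-byte long timestamp");
    }
    return ByteBuffer.wrap(value).getLong();
  }

  /** Pluggable variant: the caller supplies how a timestamp is derived from the source. */
  static <T> boolean shouldCache(T source, ToLongFunction<T> timestampProvider,
      long nowMs, long hotAgeLimitMs) {
    return isHot(timestampProvider.applyAsLong(source), nowMs, hotAgeLimitMs);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long limit = TimeUnit.DAYS.toMillis(7); // hypothetical 7-day "hot" window

    byte[] recent = ByteBuffer.allocate(Long.BYTES)
        .putLong(now - TimeUnit.DAYS.toMillis(1)).array();
    byte[] old = ByteBuffer.allocate(Long.BYTES)
        .putLong(now - TimeUnit.DAYS.toMillis(30)).array();

    System.out.println(shouldCache(recent, TimeBasedPrioritySketch::decodeTimestamp, now, limit));
    System.out.println(shouldCache(old, TimeBasedPrioritySketch::decodeTimestamp, now, limit));
  }
}
```

In this sketch, the same isHot check would serve both decisions the description mentions (admission to the cache and selection for eviction ahead of LRU); how the merged feature actually wires this in is described in the linked design docs.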