[
https://issues.apache.org/jira/browse/HBASE-20045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399810#comment-16399810
]
Zach York commented on HBASE-20045:
-----------------------------------
I have seen some interest in adding compacted blocks to the bucketcache when
cache on write is enabled. Otherwise the read performance can get very bad
after compactions. See [1] for where this is enabled.
If size is a concern, could we have a reference block inserted instead? As far
as I understand, the data won't be changing with compactions, only the cache
key (which depends on file name). However, a reference would incur an
additional read from the block cache (or bucketcache).
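
To make the trade-off concrete, here is a minimal sketch of the reference idea. It assumes, as the paragraph above does, that a compacted block's bytes match a block that is already cached. Every name in it (`ReferenceBlockSketch`, `Key`, `Entry`, `putReference`) is hypothetical and is not part of the HBase BlockCache API. It only shows that a reference entry costs one extra cache read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch only; none of these types are HBase classes.
final class ReferenceBlockSketch {

    // Cache key built from the file name and the block offset.
    static final class Key {
        final String fileName;
        final long offset;

        Key(String fileName, long offset) {
            this.fileName = fileName;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return offset == k.offset && fileName.equals(k.fileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(fileName, offset);
        }
    }

    // An entry holds either block bytes or a reference to another key.
    static final class Entry {
        final byte[] data;  // non-null for a real block
        final Key target;   // non-null for a reference block

        Entry(byte[] data, Key target) {
            this.data = data;
            this.target = target;
        }
    }

    private final Map<Key, Entry> cache = new HashMap<>();
    int cacheReads = 0;  // counts map lookups, to show the cost of a reference

    void putBlock(Key key, byte[] data) {
        cache.put(key, new Entry(data, null));
    }

    // Insert a small reference under the new (post-compaction) key
    // instead of a second copy of the block.
    void putReference(Key newKey, Key existingKey) {
        cache.put(newKey, new Entry(null, existingKey));
    }

    // Returns the block bytes, or null on a miss. Only one level of
    // reference is followed; a dangling or chained reference is a miss.
    byte[] get(Key key) {
        cacheReads++;
        Entry e = cache.get(key);
        if (e == null) {
            return null;
        }
        if (e.target == null) {
            return e.data;
        }
        cacheReads++;  // the additional read mentioned above
        Entry t = cache.get(e.target);
        return t == null ? null : t.data;
    }
}
```

Note that in this sketch the reference only resolves while the original block stays cached, so evicting the old file's blocks would turn every reference into a miss.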
Sorry to jump in so late in the conversation!
[1] https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java#L1080
> When running compaction, cache recent blocks.
> ---------------------------------------------
>
> Key: HBASE-20045
> URL: https://issues.apache.org/jira/browse/HBASE-20045
> Project: HBase
> Issue Type: New Feature
> Components: BlockCache, Compaction
> Affects Versions: 2.0.0-beta-1
> Reporter: Jean-Marc Spaggiari
> Priority: Major
>
> HBase already allows caching blocks on flush. This is very useful for
> use cases where most queries are against recent data. However, as soon as
> there is a compaction, those blocks are evicted. It would be interesting to
> have a table-level parameter to say "When compacting, cache blocks less than
> 24 hours old". That way, when running a compaction, all blocks in which some
> data is less than 24 hours old will be automatically cached.
>
> Very useful for table designs where there is a timestamp in the key but a
> long history (like a year of sensor data).
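
The rule proposed in the description can be sketched as a simple predicate. This is only an illustration under stated assumptions: `RecentBlockPolicySketch` and `shouldCacheOnCompaction` are hypothetical names, no such table-level parameter is confirmed to exist in HBase, and the sketch assumes the compaction writer knows the newest cell timestamp in each block it writes.

```java
import java.time.Duration;

// Hypothetical sketch only; not an HBase class or configuration option.
final class RecentBlockPolicySketch {

    private final long maxAgeMillis;

    // maxAge stands in for the proposed table-level parameter, e.g. 24 hours.
    RecentBlockPolicySketch(Duration maxAge) {
        this.maxAgeMillis = maxAge.toMillis();
    }

    // A block qualifies when at least some of its data is recent, which is
    // the same as its newest cell being younger than the threshold.
    boolean shouldCacheOnCompaction(long newestCellTimestampMillis, long nowMillis) {
        return nowMillis - newestCellTimestampMillis < maxAgeMillis;
    }
}
```

With a 24-hour threshold, a block whose newest cell was written one hour ago would be cached during compaction, while a block whose newest cell is two days old would not.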