[ https://issues.apache.org/jira/browse/HBASE-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anoop Sam John updated HBASE-20284:
-----------------------------------
    Summary: BucketCache reduce heap overhead : Investigate removal of NavigableSet 'blocksByHFile'  (was: Reduce heap overhead : Investigate removal of NavigableSet 'blocksByHFile')

> BucketCache reduce heap overhead : Investigate removal of NavigableSet 'blocksByHFile'
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-20284
>                 URL: https://issues.apache.org/jira/browse/HBASE-20284
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>
> This Set takes 40 bytes per entry (block). As of now the total heap
> requirement per entry is 160 bytes, so avoiding this Set would be a 25%
> reduction.
> The Set is used to remove the blocks of a specific HFile after the file is
> invalidated (mostly because of compaction or a Store close). Check other
> ways to remove these blocks, maybe asynchronously after the compaction is
> over, by a dedicated cleaner thread (?)
> It might be OK not to remove the invalidated file's entries immediately.
> When the cache runs out of space, the eviction thread might select them
> and remove them. A few things to consider/change:
> 1. When the compaction process reads blocks, they might be served from the
> cache. We should not count this as a real access to the block. Not
> counting it increases the chances of the eviction thread selecting this
> block for removal. We should be able to clearly distinguish a cache read
> by the compaction process from a cache read by a user read.
> 2. When the compaction process reads a block from the cache, can we mark
> the block in some way (using a one-byte boolean) to record that it has
> just gone through compaction? Later, when the eviction thread selects a
> block and there is a tie because of the same access time/count, we could
> break the tie in favor of selecting the already compacted block. Need to
> check the pros and cons.
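
The two points above can be sketched roughly as follows. This is only an illustration of the idea, not BucketCache code: the class CompactionAwareEviction, the Entry type, its fields, and the onRead/evictionOrder methods are all hypothetical names made up for this example, and the single access counter is an assumed stand-in for whatever access time/count the cache really tracks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names are taken from HBase. */
final class CompactionAwareEviction {

  /** Minimal stand-in for the per-block bookkeeping of a cache entry. */
  static final class Entry {
    final String blockName;
    private long accessCounter; // updated by user reads only
    private boolean compacted;  // the one-byte flag from point 2

    Entry(String blockName) {
      this.blockName = blockName;
    }

    /**
     * Point 1: a compaction read only marks the entry and does not
     * count as a real access. A user read records the access sequence.
     */
    void onRead(boolean isCompactionRead, long accessSequence) {
      if (isCompactionRead) {
        compacted = true;
      } else {
        accessCounter = accessSequence;
      }
    }

    long accessCounter() {
      return accessCounter;
    }

    boolean isCompacted() {
      return compacted;
    }
  }

  /**
   * Eviction order: smallest access counter first. On a tie, an entry
   * that was already read by a compaction sorts before one that was not.
   */
  static final Comparator<Entry> EVICTION_ORDER = (a, b) -> {
    int byAccess = Long.compare(a.accessCounter(), b.accessCounter());
    if (byAccess != 0) {
      return byAccess;
    }
    // Boolean.compare(false, true) < 0, so swap the arguments to put
    // compacted == true first.
    return Boolean.compare(b.isCompacted(), a.isCompacted());
  };

  /** Returns a sorted copy; the first element is the first eviction candidate. */
  static List<Entry> evictionOrder(List<Entry> entries) {
    List<Entry> sorted = new ArrayList<>(entries);
    sorted.sort(EVICTION_ORDER);
    return sorted;
  }
}
```

With this ordering the flag matters only when two entries tie on the access counter, so it does not change which blocks are considered cold. Whether one extra byte per entry is worth it against the 40 bytes saved by dropping the Set is part of the pros and cons still to be checked.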