[ 
https://issues.apache.org/jira/browse/HBASE-20284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-20284:
-----------------------------------
    Summary: BucketCache reduce heap overhead : Investigate removal of 
NavigableSet 'blocksByHFile'  (was: Reduce heap overhead : Investigate removal 
of NavigableSet 'blocksByHFile')

> BucketCache reduce heap overhead : Investigate removal of NavigableSet 
> 'blocksByHFile'
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-20284
>                 URL: https://issues.apache.org/jira/browse/HBASE-20284
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>
> This Set takes 40 bytes per entry (block). At present the total heap 
> requirement per entry is 160 bytes, so avoiding this Set would be a 25% 
> reduction. The Set is used to remove the blocks of a specific HFile after 
> the file is invalidated (mostly because of compaction or Store close). We 
> should check other ways to remove these blocks, perhaps asynchronously after 
> the compaction is over, by a dedicated cleaner thread (?). It might be OK not 
> to remove the invalidated file's entries immediately: when the cache is out 
> of space, the eviction thread might select and remove them. A few things to 
> consider/change:
> 1. When the compaction process reads blocks, they might be served from the 
> cache. We should not count this as a real access to the block. Not counting 
> it increases the chance that the eviction thread selects the block for 
> removal. We need to be able to clearly distinguish cache reads by the 
> compaction process from reads by user requests.
> 2. When the compaction process reads a block from the cache, can we somehow 
> mark the block (using a one-byte boolean) as having just gone through 
> compaction? Later, when the eviction thread is selecting a block and there is 
> a tie because of equal access time/count, we could break the tie in favor of 
> selecting the already compacted block. Need to check the pros and cons.
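
The two points above could be sketched roughly as follows. This is only an illustration of the idea, not HBase's BucketCache code: the class EvictionSketch, its Entry type, and the fields accessCounter and readByCompaction are hypothetical names made up for this example. It assumes a compaction read sets the one-byte flag without refreshing the access counter, and that the eviction order breaks access-counter ties in favor of flagged blocks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of points 1 and 2; not the real BucketCache API. */
class EvictionSketch {

  /** Hypothetical per-block bookkeeping kept by the cache. */
  static final class Entry {
    final String blockName;
    long accessCounter;        // refreshed only by user reads (point 1)
    boolean readByCompaction;  // the one-byte flag from point 2

    Entry(String blockName, long accessCounter) {
      this.blockName = blockName;
      this.accessCounter = accessCounter;
    }

    /** Records a cache hit; a compaction read only sets the flag. */
    void onCacheHit(boolean isCompactionRead, long newAccessCounter) {
      if (isCompactionRead) {
        readByCompaction = true;
      } else {
        accessCounter = newAccessCounter;
      }
    }
  }

  /**
   * Eviction order: lowest access counter first. On a tie, blocks already
   * read by a compaction come first (false sorts before true, so the key
   * is the negated flag).
   */
  static final Comparator<Entry> EVICTION_ORDER =
      Comparator.comparingLong((Entry e) -> e.accessCounter)
          .thenComparing((Entry e) -> !e.readByCompaction);

  /** Returns block names in the order the sketch would evict them. */
  static List<String> evictionOrder(List<Entry> entries) {
    List<Entry> sorted = new ArrayList<>(entries);
    sorted.sort(EVICTION_ORDER);
    List<String> names = new ArrayList<>();
    for (Entry e : sorted) {
      names.add(e.blockName);
    }
    return names;
  }
}
```

With this ordering, a block that was only touched by a compaction keeps its old access counter and, among equally old blocks, is evicted first. Whether that is the right trade-off is exactly the open question raised in point 2.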



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
