Anoop Sam John created HBASE-20284:
--------------------------------------

             Summary: Reduce heap overhead : Investigate removal of 
NavigableSet 'blocksByHFile'
                 Key: HBASE-20284
                 URL: https://issues.apache.org/jira/browse/HBASE-20284
             Project: HBase
          Issue Type: Sub-task
            Reporter: Anoop Sam John
            Assignee: Anoop Sam John


This Set takes 40 bytes per entry (block). As of now the total heap 
requirement per entry is 160 bytes, so avoiding this Set would be a 25% 
reduction. The Set is used to remove the blocks of a specific HFile after the 
file is invalidated (mostly because of its compaction or a Store close). We 
should check other ways to remove the blocks, perhaps asynchronously after the 
compaction is over, by a dedicated cleaner thread (a sketch of such a cleaner 
follows the list below). It might be OK not to remove the invalidated file's 
entries immediately; when the cache is out of space, the eviction thread can 
select and remove them. A few things to consider/change:
1. When the compaction process reads blocks, they might be served from the 
cache. We should not count such an access as a real block access for the 
block; skipping the count increases the chances of the eviction thread 
selecting the block for removal later. We need a way to clearly distinguish a 
cache read by the compaction process from a user read (see the second sketch 
below).
2. When the compaction process reads a block from the cache, we could mark the 
block in some way (using a one-byte boolean) to record that it has already 
gone through a compaction. Later, when the eviction thread selects a block and 
there is a tie because of equal access time/count, we could break the tie in 
favor of the already-compacted block (see the third sketch below). The pros 
and cons of this need to be checked.
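
To make the dedicated-cleaner idea concrete, here is a minimal sketch. It 
assumes a hypothetical backing map keyed by (HFile name, offset); the names 
(LazyFileBlockCleaner, backingMap, invalidatedFiles) are illustrative 
stand-ins, not the existing BucketCache API. Invalidation becomes an O(1) 
record of the file name, and the periodic sweep finds the file's blocks by 
scanning, so no per-block NavigableSet entry is needed:

{code:java}
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; not the real BucketCache code.
public class LazyFileBlockCleaner {
  // Stand-in for the real cache key: (HFile name, block offset).
  static class BlockCacheKey {
    final String hfileName;
    final long offset;
    BlockCacheKey(String hfileName, long offset) {
      this.hfileName = hfileName;
      this.offset = offset;
    }
  }

  private final Map<BlockCacheKey, Object> backingMap = new ConcurrentHashMap<>();
  // One String per invalidated file instead of 40 bytes per cached block.
  private final Set<String> invalidatedFiles = ConcurrentHashMap.newKeySet();
  private final ScheduledExecutorService cleaner =
      Executors.newSingleThreadScheduledExecutor();

  public LazyFileBlockCleaner() {
    // The dedicated cleaner thread sweeps periodically, off the compaction path.
    cleaner.scheduleWithFixedDelay(this::sweep, 10, 10, TimeUnit.SECONDS);
  }

  /** Called after a compaction or Store close; no Set of blocks to walk. */
  public void fileInvalidated(String hfileName) {
    invalidatedFiles.add(hfileName);
  }

  private void sweep() {
    // Snapshot so files invalidated during the scan are kept for the next sweep.
    Set<String> stale = new HashSet<>(invalidatedFiles);
    if (stale.isEmpty()) {
      return;
    }
    backingMap.keySet().removeIf(key -> stale.contains(key.hfileName));
    invalidatedFiles.removeAll(stale);
  }
}
{code}

The trade-off is one full scan of the backing map per sweep instead of 
O(blocks-per-file) targeted removals; whether that is acceptable depends on 
the sweep interval and the cache size.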
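
For point 1, a sketch of how the read path could skip access accounting when 
the caller is the compaction process; the names (CachedBlock, forCompaction) 
are hypothetical, not the current cache API:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only.
class CachedBlock {
  private final AtomicLong accessCount = new AtomicLong();
  private volatile long lastAccessTime;
  private final Object data;

  CachedBlock(Object data) {
    this.data = data;
  }

  // Callers pass 'true' from the compaction read path, 'false' from user reads.
  Object read(boolean forCompaction) {
    if (!forCompaction) {
      // Only real user reads count toward recency/frequency, so blocks that
      // were only touched by a compaction stay good eviction candidates.
      accessCount.incrementAndGet();
      lastAccessTime = System.nanoTime();
    }
    return data;
  }
}
{code}

The flag would have to be threaded through from the scanner opened by the 
compaction down to the cache lookup.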
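
For point 2, a sketch of the tie-break: the one-byte boolean set on a 
compaction read orders the block ahead of otherwise-equal candidates, so the 
eviction thread prefers it. Again, all names are illustrative:

{code:java}
import java.util.Comparator;

// Illustrative sketch only.
class EvictionOrder {
  static class BlockEntry {
    final long lastAccessTime;
    final long accessCount;
    final boolean compactedAway; // set when the compaction process read the block

    BlockEntry(long lastAccessTime, long accessCount, boolean compactedAway) {
      this.lastAccessTime = lastAccessTime;
      this.accessCount = accessCount;
      this.compactedAway = compactedAway;
    }
  }

  // Eviction picks from the head: oldest access first, then lowest count;
  // on a full tie, 'compactedAway == true' sorts first and gets evicted.
  static final Comparator<BlockEntry> EVICTION_ORDER =
      Comparator.comparingLong((BlockEntry b) -> b.lastAccessTime)
          .thenComparingLong((BlockEntry b) -> b.accessCount)
          .thenComparing((BlockEntry b) -> b.compactedAway,
              Comparator.reverseOrder());
}
{code}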




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
