[ https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237495#comment-16237495 ]

Anoop Sam John edited comment on HBASE-17819 at 11/3/17 11:52 AM:
------------------------------------------------------------------

To explain the approach: this is a bit different from the V2 patch. Major changes are
1. BucketEntry is extended to make the SharedMemory BucketEntry.  For file
mode, there is no need to keep the ref count, as that is not a shared-memory type.
So I removed the new states added for HBASE-11425 from BucketEntry.  For the off-heap
mode BucketEntry, we now have an extension which holds the new states.
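A minimal sketch of that split, assuming illustrative class and field names (these are not the exact classes in the patch):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the split described in point 1; names are illustrative,
// not the patch's actual code.
class BucketEntry {
    int offsetBase;         // lower bits of the bucket offset
    int length;             // block length inside the bucket
    byte offset1;           // high byte of the offset
    byte deserialiserIndex;
    long accessCounter;     // feeds the LRU-style eviction
    long cachedTime;
    // No refCount / markedForEvict here: file-mode entries are not
    // shared memory, so they never hand out direct buffer references.
}

// Only the off-heap (shared memory) mode pays for the reference-count
// state that HBASE-11425 introduced.
class SharedMemoryBucketEntry extends BucketEntry {
    volatile boolean markedForEvict;
    final AtomicInteger refCount = new AtomicInteger(0);
}
```

File-mode entries thus stay at the slimmer base-class footprint, and only the shared-memory path carries the extra fields.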
2. Removed the CSLM that kept the per-HFile-name blocks info.
evictBlocksByHfileName will have a perf impact, as it now has to iterate through all
the entries to check whether each block entry belongs to this file or not.  For
that, evictBlocksByHfileName is changed to be an async op: a dedicated
eviction thread will do this work.  Anyway, even if we don't remove these blocks,
or there is a delay in removal, these blocks will eventually get removed, as we have
an LRU algo for eviction.  So when there is no space left for new blocks to be
added, eviction will happen, removing unused blocks.  Moreover, eviction
of blocks on HFile close is off by default (we have a config to turn this
on).  When it is a compaction, for the compacted files, we have evictByHFiles
happening now.  There will be a bit more delay for the actual removal of the
blocks.
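The async handoff above can be sketched roughly as follows; the queue, the key format, and the method names are assumptions for illustration, not the patch's actual code:

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Rough sketch of the async evictBlocksByHfileName described in point 2.
class FileEvictionSketch {
    // Backing map keyed by "<hfileName>_<offset>"; with the CSLM index
    // gone, eviction by file name must scan every entry.
    final Map<String, Long> backingMap = new ConcurrentHashMap<>();
    final Queue<String> filesToEvict = new ConcurrentLinkedQueue<>();

    // Caller (e.g. on HFile close or compaction) returns immediately.
    void evictBlocksByHfileName(String hfileName) {
        filesToEvict.offer(hfileName);
    }

    // One pass of the dedicated eviction thread's work: the full-map
    // scan happens here, off the caller's path.
    void drainOnce() {
        String hfileName = filesToEvict.poll();
        if (hfileName != null) {
            backingMap.keySet().removeIf(k -> k.startsWith(hfileName + "_"));
        }
    }
}
```

The caller only pays for a queue offer; the O(n) scan is absorbed by the background thread, and any blocks it misses are reclaimed later by the normal LRU eviction anyway.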
But we save a lot of heap memory per entry with this approach. The math is
in the above comment:
{quote}
Now - 32 + 64 + 40 + 40 = 176
After patch - 32 + 48 + 40 = 120
Tested with Java Instrumentation
{quote}
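At those per-entry sizes, the saving for the 10-million-block example in the issue description works out as:

```java
// Heap cost per cached block, in bytes, from the Instrumentation
// numbers quoted above.
public class BucketCacheHeapMath {
    public static void main(String[] args) {
        long before = 32 + 64 + 40 + 40; // key + entry + CHM entry + CSLS entry = 176
        long after  = 32 + 48 + 40;      // key + slimmer entry + CHM entry = 120
        long blocks = 10_000_000L;
        long saved  = (before - after) * blocks; // 56 bytes x 10M = 560,000,000 bytes
        System.out.println(saved + " bytes saved");
    }
}
```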



> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
>                 Key: HBASE-17819
>                 URL: https://issues.apache.org/jira/browse/HBASE-17819
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17819_V1.patch, HBASE-17819_V2.patch, 
> HBASE-17819_V3.patch
>
>
> We keep the bucket entry map in BucketCache.  Below is the math for heapSize for 
> the key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName  -  Ref  - 4
> long offset  - 8
> BlockType blockType  - Ref  - 4
> boolean isPrimaryReplicaBlock  - 1
> Total  =  12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase  -  4
> int length  - 4
> byte offset1  -  1
> byte deserialiserIndex  -  1
> long accessCounter  -  8
> BlockPriority priority  - Ref  - 4
> volatile boolean markedForEvict  -  1
> AtomicInteger refCount  -  16 + 4
> long cachedTime  -  8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry  -  40
> blocksByHFile ConcurrentSkipListSet Entry  -  40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up having 1.6GB of heap size.  
> This jira aims to reduce this as much as possible



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
