[ 
https://issues.apache.org/jira/browse/HBASE-16438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942275#comment-15942275
 ] 

Anastasia Braginsky commented on HBASE-16438:
---------------------------------------------

Hi [~anoop.hbase], [~ram_krish], and [~carp84],

As I understand it, in HBASE-16195 you avoid adding a chunk to MSLAB's 
list of chunks in order to make garbage collection (GC) faster. So when 
there are no more references to this chunk from a SkipList/CellArrayMap, the 
chunk's memory can be freed by the GC. 

So now openScannerCount in MSLAB only serves to determine when chunks can 
be returned to the pool (if they were allocated from the pool).
Why do we care about the chunks allocated from the pool but not about 
those handled by the GC (allocated by the JVM)? The problem is as follows:
When a Segment is removed (say, due to a flush to disk), it is no longer 
referenced from the MemStore and the Segment is closed, followed by the close 
of its MSLAB.
However, there might still be ongoing scans accessing the chunks of this 
segment. Those chunks cannot be deallocated by the GC because the scans still 
hold references to them. But if we return the chunks to the pool, they can be 
reused and the memory corrupted while a scan is still reading it.
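
The role of openScannerCount described above can be sketched roughly as 
follows. This is only an illustration under simplifying assumptions: the 
class and method names (ScanSketchPool, ScanSketchMslab, giveBack, and so on) 
are hypothetical and are not the actual HBase MSLAB or chunk pool code. 
Pooled chunks go back to the pool only once the MSLAB is closed and the last 
scanner has finished.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical fixed-size chunk pool; not the real HBase chunk pool.
final class ScanSketchPool {
    private final int chunkSize;
    private final Deque<byte[]> free = new ArrayDeque<>();

    ScanSketchPool(int chunkSize) { this.chunkSize = chunkSize; }

    synchronized byte[] take() {
        byte[] chunk = free.poll();
        return chunk != null ? chunk : new byte[chunkSize];
    }

    synchronized void giveBack(byte[] chunk) { free.push(chunk); }

    synchronized int freeCount() { return free.size(); }
}

// Hypothetical MSLAB stand-in that tracks only its pooled chunks.
final class ScanSketchMslab {
    private final ScanSketchPool pool;
    private final List<byte[]> pooledChunks = new ArrayList<>();
    private int openScannerCount = 0;
    private boolean closed = false;
    private boolean reclaimed = false;

    ScanSketchMslab(ScanSketchPool pool) { this.pool = pool; }

    synchronized byte[] allocatePooledChunk() {
        byte[] chunk = pool.take();
        pooledChunks.add(chunk);
        return chunk;
    }

    synchronized void incScannerCount() { openScannerCount++; }

    synchronized void decScannerCount() {
        openScannerCount--;
        maybeReclaim();
    }

    // Called when the owning segment is closed (for example after a flush).
    synchronized void close() {
        closed = true;
        maybeReclaim();
    }

    // Return chunks to the pool only when no scan can still be reading them.
    private void maybeReclaim() {
        if (closed && openScannerCount == 0 && !reclaimed) {
            reclaimed = true;
            for (byte[] chunk : pooledChunks) {
                pool.giveBack(chunk);
            }
            pooledChunks.clear();
        }
    }
}
```

In this sketch, closing the MSLAB while a scan is open leaves the chunk out of 
the pool, and the chunk is handed back only when decScannerCount() brings the 
count to zero.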

Now that we are introducing the ChunkCreator (which keeps a chunk ID to chunk 
map), you are afraid that we keep references to chunks for too long and delay 
the GC. I am saying all this so you can check whether I understand it 
correctly. If I am wrong, please correct me. If I am right, then I have a 
suggestion for the following *simple* solution.

As Ram has suggested, keep a boolean in the chunk saying whether it came from 
the pool or not, and... just remove the chunk ID to chunk mapping when the 
segment is closed (for chunks that are not from the pool)!

The scans (if they are still working) don't need the translation from chunk ID 
to chunk. This translation is needed only for flattening/compaction when the 
segment is still alive. How about that? :)
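
A minimal sketch of the suggestion above, under stated assumptions: all names 
here (SketchChunk, SketchChunkCreator, SketchSegment, removeMapping) are 
hypothetical stand-ins and not the actual HBase classes or the patch. The 
segment remembers only chunk IDs, and on close it drops the ID to chunk 
mapping for chunks that did not come from the pool, so the map no longer 
keeps them reachable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical chunk carrying the "from pool or not" boolean.
final class SketchChunk {
    final int id;
    final boolean fromPool;
    final byte[] data;

    SketchChunk(int id, boolean fromPool, int size) {
        this.id = id;
        this.fromPool = fromPool;
        this.data = new byte[size];
    }
}

// Hypothetical creator that owns the chunk ID to chunk map.
final class SketchChunkCreator {
    private final Map<Integer, SketchChunk> chunkIdMap = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    SketchChunk createChunk(boolean fromPool, int size) {
        SketchChunk chunk = new SketchChunk(nextId.getAndIncrement(), fromPool, size);
        chunkIdMap.put(chunk.id, chunk);
        return chunk;
    }

    // Needed for flattening/compaction while the segment is alive.
    SketchChunk getChunk(int id) { return chunkIdMap.get(id); }

    void removeMapping(int id) { chunkIdMap.remove(id); }

    int mappedCount() { return chunkIdMap.size(); }
}

// Hypothetical segment that tracks chunk IDs, not chunk references.
final class SketchSegment {
    private final SketchChunkCreator creator;
    private final List<Integer> chunkIds = new ArrayList<>();

    SketchSegment(SketchChunkCreator creator) { this.creator = creator; }

    SketchChunk allocate(boolean fromPool, int size) {
        SketchChunk chunk = creator.createChunk(fromPool, size);
        chunkIds.add(chunk.id);
        return chunk;
    }

    // On close, unmap only the chunks that are not from the pool.
    void close() {
        for (int id : chunkIds) {
            SketchChunk chunk = creator.getChunk(id);
            if (chunk != null && !chunk.fromPool) {
                creator.removeMapping(id);
            }
        }
        chunkIds.clear();
    }
}
```

After close(), a non-pool chunk is reachable only through whatever scans 
still reference it, so the GC can free it once those scans finish. Pooled 
chunks keep their mapping in this sketch because they are reused rather than 
collected.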



> Create a cell type so that chunk id is embedded in it
> -----------------------------------------------------
>
>                 Key: HBASE-16438
>                 URL: https://issues.apache.org/jira/browse/HBASE-16438
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-16438_1.patch, 
> HBASE-16438_3_ChunkCreatorwrappingChunkPool.patch, 
> HBASE-16438_4_ChunkCreatorwrappingChunkPool.patch, HBASE-16438.patch, 
> MemstoreChunkCell_memstoreChunkCreator_oldversion.patch, 
> MemstoreChunkCell_trunk.patch
>
>
> For CellChunkMap we may need a cell such that the chunk out of which it was 
> created, the id of the chunk be embedded in it so that when doing flattening 
> we can use the chunk id as a meta data. More details will follow once the 
> initial tasks are completed. 
> Why we need to embed the chunkid in the Cell is described by [~anastas] in 
> this remark over in parent issue 
> https://issues.apache.org/jira/browse/HBASE-14921?focusedCommentId=15244119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15244119



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
