[ https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683492#comment-16683492 ]

Eshcar Hillel commented on HBASE-20542:
---------------------------------------

bq. if cellSize is larger than inmemoryFlushSize, will 
AbstractMemStore#doAddOrUpsert never break the while loop?

In-memory compaction is aimed at column families with small cells, so we 
should not expect cells that are bigger than the in-memory flush size.

However, your claim is correct:
If cellSize > inmemoryFlushSize, CompactingMemStore#shouldFlushInMemory breaks 
its while loop and returns true, since the size is above the flush threshold.
In turn, CompactingMemStore#checkAndAddToActiveSize flushes the active segment 
into the compaction pipeline, dispatches the in-memory compaction thread, and 
returns false.
This makes CompactingMemStore#preUpdate release the lock on the active segment 
and return false, so AbstractMemStore#doAddOrUpsert retries, potentially 
hitting an infinite loop.
We may want to throw an exception here instead of looping indefinitely.
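
To make the failure mode concrete, here is a minimal, self-contained Java 
sketch of the retry loop described above. The method names mirror the HBase 
methods, but every body is a condensed stand-in, not the actual source:

{code:java}
// Condensed sketch of the retry loop discussed above; illustrative only.
public class ImcRetryLoopSketch {
  static final long INMEMORY_FLUSH_SIZE = 1024; // assumed threshold (bytes)
  static long activeSegmentDataSize = 0;

  // Stand-in for CompactingMemStore#shouldFlushInMemory: true when adding
  // the cell would push the active segment over the flush threshold.
  static boolean shouldFlushInMemory(long cellSize) {
    return activeSegmentDataSize + cellSize >= INMEMORY_FLUSH_SIZE;
  }

  // Stand-in for CompactingMemStore#checkAndAddToActiveSize: on overflow
  // it "flushes" the active segment (here: resets it) and returns false,
  // telling the caller to retry against the new, empty segment.
  static boolean checkAndAddToActiveSize(long cellSize) {
    if (shouldFlushInMemory(cellSize)) {
      activeSegmentDataSize = 0; // swap in a fresh empty segment
      return false;
    }
    activeSegmentDataSize += cellSize;
    return true;
  }

  // Stand-in for AbstractMemStore#doAddOrUpsert's retry loop. If
  // cellSize >= INMEMORY_FLUSH_SIZE, even an empty segment overflows,
  // so the loop never terminates.
  static void doAdd(long cellSize) {
    boolean succ = false;
    while (!succ) {
      succ = checkAndAddToActiveSize(cellSize);
    }
  }

  public static void main(String[] args) {
    doAdd(100);    // fits: terminates after one iteration
    // doAdd(2048); // larger than the flush size: would spin forever
    System.out.println("active segment size: " + activeSegmentDataSize);
  }
}
{code}

Running it terminates only because the oversized add is commented out.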

One way for the application to work around this issue is to configure a larger 
portion of the memstore for the active segment, so that the cell size is 
always smaller than inmemoryFlushSize.
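
For example, assuming the HBase 2.x configuration key 
{{hbase.memstore.inmemoryflush.threshold.factor}} 
(CompactingMemStore#IN_MEMORY_FLUSH_THRESHOLD_FACTOR_KEY; please verify the 
key and its default against your version), the factor could be raised along 
these lines:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ImcTuningSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Assumed key per HBase 2.x CompactingMemStore; a larger factor gives
    // the active segment a bigger share of the memstore flush size, keeping
    // inmemoryFlushSize above any expected cell size.
    conf.setDouble("hbase.memstore.inmemoryflush.threshold.factor", 0.25);
    System.out.println(
        conf.get("hbase.memstore.inmemoryflush.threshold.factor"));
  }
}
{code}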

> Better heap utilization for IMC with MSLABs
> -------------------------------------------
>
>                 Key: HBASE-20542
>                 URL: https://issues.apache.org/jira/browse/HBASE-20542
>             Project: HBase
>          Issue Type: Sub-task
>          Components: in-memory-compaction
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-20542-addendum.master.005.patch, 
> HBASE-20542.branch-2.001.patch, HBASE-20542.branch-2.003.patch, 
> HBASE-20542.branch-2.004.patch, HBASE-20542.branch-2.005.patch, 
> HBASE-20542.master.003.patch, HBASE-20542.master.005-addendum.patch, run.sh, 
> workloada, workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized that in-memory compaction combined with 
> MSLABs may suffer from heap under-utilization due to internal fragmentation. 
> This jira presents a solution to circumvent this problem. The main idea is 
> to have each update operation check whether it would overflow the active 
> segment *before* writing the new value (instead of checking the size after 
> the write completes); if it would, the active segment is atomically swapped 
> with a new empty segment and is pushed (full-yet-not-overflowed) to the 
> compaction pipeline. Later on, the IMC daemon runs its compaction operation 
> (flatten index/merge indices/data compaction) in the background. Some subtle 
> concurrency issues must be handled with care; we elaborate on them next.
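
As a minimal, hypothetical illustration of the check-before-write swap 
described above (all names are invented, and it deliberately shares the 
oversized-cell hazard discussed in the comment):

{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Reserve space in the active segment *before* writing; if the reservation
// would overflow, atomically swap in an empty segment and push the full one
// to the compaction pipeline. Illustrative sketch only.
public class CheckBeforeWriteSketch {
  static final long FLUSH_SIZE = 1024; // assumed in-memory flush threshold

  static class Segment {
    final AtomicLong dataSize = new AtomicLong();

    // CAS-reserve space for a cell; fail if the segment would overflow.
    boolean tryReserve(long cellSize) {
      long cur;
      do {
        cur = dataSize.get();
        if (cur + cellSize > FLUSH_SIZE) {
          return false; // full-yet-not-overflowed: caller swaps segments
        }
      } while (!dataSize.compareAndSet(cur, cur + cellSize));
      return true;
    }
  }

  final AtomicReference<Segment> active = new AtomicReference<>(new Segment());
  final Queue<Segment> pipeline = new ConcurrentLinkedQueue<>();

  void add(long cellSize) {
    // Note: like the real code, this spins forever if a single cell is
    // larger than FLUSH_SIZE, since a fresh segment cannot hold it either.
    while (true) {
      Segment seg = active.get();
      if (seg.tryReserve(cellSize)) {
        // ... write the cell into seg here ...
        return;
      }
      // Only the thread winning the swap pushes the old segment to the
      // pipeline, where the IMC daemon would later compact it.
      if (active.compareAndSet(seg, new Segment())) {
        pipeline.add(seg);
      }
    }
  }

  public static void main(String[] args) {
    CheckBeforeWriteSketch m = new CheckBeforeWriteSketch();
    for (int i = 0; i < 20; i++) {
      m.add(100); // fills one segment and triggers a swap
    }
    System.out.println("segments pushed to pipeline: " + m.pipeline.size());
  }
}
{code}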


