[ https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360612#comment-15360612 ]
Anoop Sam John commented on HBASE-16162: ---------------------------------------- We have allowCompaction AtomicBoolean read and check in flushInMemory() again.. This fact is used in one of the test. We can change this test and avoid this extra read IMO. WDYT? > Compacting Memstore : unnecessary push of active segments to pipeline > --------------------------------------------------------------------- > > Key: HBASE-16162 > URL: https://issues.apache.org/jira/browse/HBASE-16162 > Project: HBase > Issue Type: Sub-task > Reporter: Anoop Sam John > Assignee: Anoop Sam John > Priority: Critical > Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, > HBASE-16162_V3.patch > > > We have flow like this > {code} > protected void checkActiveSize() { > if (shouldFlushInMemory()) { > InMemoryFlushRunnable runnable = new InMemoryFlushRunnable(); > } > getPool().execute(runnable); > } > } > private boolean shouldFlushInMemory() { > if(getActive().getSize() > inmemoryFlushSize) { > // size above flush threshold > return (allowCompaction.get() && !inMemoryFlushInProgress.get()); > } > return false; > } > void flushInMemory() throws IOException { > // Phase I: Update the pipeline > getRegionServices().blockUpdates(); > try { > MutableSegment active = getActive(); > pushActiveToPipeline(active); > } finally { > getRegionServices().unblockUpdates(); > } > // Phase II: Compact the pipeline > try { > if (allowCompaction.get() && > inMemoryFlushInProgress.compareAndSet(false, true)) { > // setting the inMemoryFlushInProgress flag again for the case this > method is invoked > // directly (only in tests) in the common path setting from true to > true is idempotent > // Speculative compaction execution, may be interrupted if flush is > forced while > // compaction is in progress > compactor.startCompaction(); > } > {code} > So every write of cell will produce the check checkActiveSize(). When we > are at border of in mem flush, many threads doing writes to this memstore > can get this checkActiveSize () to pass. Yes the AtomicBoolean is still > false only. It is turned ON after some time once the new thread is started > run and it push the active to pipeline etc. > In the new thread code of inMemFlush, we dont have any size check. It just > takes the active segment and pushes that to pipeline. Yes we dont allow any > new writes to memstore at this time. But before that write lock on > region, other handler thread also might have added entry to this thread pool. > When the 1st one finishes, it releases the lock on region and handler > threads trying for write to memstore, might get lock and add some data. Now > this 2nd in mem flush thread may get a chance and get the lock and so it just > takes current active segment and flush that in memory ! This will produce > very small sized segments to pipeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)