[ https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang updated HBASE-26026: ------------------------------ Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to branch-2.3+. Thanks [~comnetwork] for contributing. > HBase Write may be stuck forever when using CompactingMemStore > -------------------------------------------------------------- > > Key: HBASE-26026 > URL: https://issues.apache.org/jira/browse/HBASE-26026 > Project: HBase > Issue Type: Bug > Components: in-memory-compaction > Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.4.0 > Reporter: chenglei > Assignee: chenglei > Priority: Critical > Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.6, 2.3.7 > > > Sometimes I observed that HBase Write might be stuck in my hbase cluster > which enabling {{CompactingMemStore}}. I have simulated the problem by unit > test in my PR. > The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} : > {code:java} > 425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell > cellToAdd, > 426 MemStoreSizing memstoreSizing) { > 427 if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) { > 428 if (currActive.setInMemoryFlushed()) { > 429 flushInMemory(currActive); > 430 if (setInMemoryCompactionFlag()) { > 431 // The thread is dispatched to do in-memory compaction in the > background > ...... > } > {code} > In line 427, {{shouldFlushInMemory}} checking if {{currActive.getDataSize}} > adding the size of {{cellToAdd}} exceeds > {{CompactingMemStore.inmemoryFlushSize}},if true, then {{currActive}} > should be flushed, {{currActive.setInMemoryFlushed()}} is invoked in line > 428 : > {code:java} > public boolean setInMemoryFlushed() { > return flushed.compareAndSet(false, true); > } > {code} > After sucessfully set {{currActive.flushed}} to true, in above line 429 > {{flushInMemory(currActive)}} invokes > {{CompactingMemStore.pushActiveToPipeline}} : > {code:java} > protected void pushActiveToPipeline(MutableSegment currActive) { > if (!currActive.isEmpty()) { > pipeline.pushHead(currActive); > resetActive(); > } > } > {code} > In above {{CompactingMemStore.pushActiveToPipeline}} method , if the > {{currActive.cellSet}} is empty, then nothing is done. Due to concurrent > writes and because we first add cell size to {{currActive.getDataSize}} and > then actually add cell to {{currActive.cellSet}}, it is possible that > {{currActive.getDataSize}} could not accommodate {{cellToAdd}} but > {{currActive.cellSet}} is still empty if pending writes which not yet add > cells to {{currActive.cellSet}}. > So if the {{currActive.cellSet}} is empty now, then no {{ActiveSegment}} is > created, and new writes still continue target to {{currActive}}, but > {{currActive.flushed}} is true, {{currActive}} could not enter > {{flushInMemory(currActive)}} again,and new {{ActiveSegment}} could not be > created forever ! In the end all writes would be stuck. > In my opinion , once {{currActive.flushed}} is set true, it could not > continue use as {{ActiveSegment}} , and because of concurrent pending writes, > only after {{currActive.updatesLock.writeLock()}} is acquired(i.e. > {{currActive.waitForUpdates}} is called) in > {{CompactingMemStore.inMemoryCompaction}} ,we can safely say {{currActive}} > is empty or not. > My fix is remove the {{if (!currActive.isEmpty())}} check here and left the > check to background {{InMemoryCompactionRunnable}} after > {{currActive.waitForUpdates}} is called. An alternative fix is we use > synchronization mechanism in {{checkAndAddToActiveSize}} method to prevent > all writes , wait for all pending write completed(i.e. > currActive.waitForUpdates is called) and if {{currActive}} is still empty > ,then we set {{currActive.flushed}} back to false,but I am not inclined to > use so heavy synchronization in write path, and I think we would better > maintain lockless implementation for {{CompactingMemStore.add}} method just > as now and {{currActive.waitForUpdates}} would better be left in background > {{InMemoryCompactionRunnable}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)