[ https://issues.apache.org/jira/browse/HBASE-23591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guanghao Zhang updated HBASE-23591:
-----------------------------------
    Fix Version/s:     (was: 2.2.7)

> Negative memStoreSizing
> -----------------------
>
>                 Key: HBASE-23591
>                 URL: https://issues.apache.org/jira/browse/HBASE-23591
>             Project: HBase
>          Issue Type: Bug
>          Components: read replicas
>            Reporter: Szabolcs Bukros
>            Priority: Major
>
> After a flush on the replica region the memStoreSizing becomes negative:
> {code:java}
> 2019-12-17 08:31:59,983 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name: "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 41392 store_flushes { family_name: "f1" store_home_dir: "f1" flush_output: "3c48a23eac784a348a18e10e337d80a2" } store_flushes { family_name: "f2" store_home_dir: "f2" flush_output: "9a5283ec95694667b4ead2398af5f01e" } store_flushes { family_name: "f3" store_home_dir: "f3" flush_output: "e6f25e6b0eca4d22af15d0626d0f8759" } region_name: "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:59,984 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with seqId:41392 and a previous prepared snapshot was found
> 2019-12-17 08:31:59,993 INFO org.apache.hadoop.hbase.regionserver.HStore: Region: 0beaae111b0f6e98bfde31ba35be5408 added hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f1/3c48a23eac784a348a18e10e337d80a2, entries=32445, sequenceid=41392, filesize=27.6 M
> 2019-12-17 08:32:00,016 INFO org.apache.hadoop.hbase.regionserver.HStore: Region: 0beaae111b0f6e98bfde31ba35be5408 added hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f2/9a5283ec95694667b4ead2398af5f01e, entries=12264, sequenceid=41392, filesize=10.9 M
> 2019-12-17 08:32:00,121 INFO org.apache.hadoop.hbase.regionserver.HStore: Region: 0beaae111b0f6e98bfde31ba35be5408 added hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f3/e6f25e6b0eca4d22af15d0626d0f8759, entries=32379, sequenceid=41392, filesize=27.5 M
> 2019-12-17 08:32:00,122 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: CustomLog decrMemStoreSize. Current: dataSize=135810071, getHeapSize=148400960, getOffHeapSize=0, getCellsCount=167243 delta: dataSizeDelta=155923644, heapSizeDelta=170112320, offHeapSizeDelta=0, cellsCountDelta=188399
> 2019-12-17 08:32:00,122 ERROR org.apache.hadoop.hbase.regionserver.HRegion: Asked to modify this region's (IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.) memStoreSizing to a negative value which is incorrect. Current memStoreSizing=135810071, delta=-155923644
> java.lang.Exception
>     at org.apache.hadoop.hbase.regionserver.HRegion.checkNegativeMemStoreDataSize(HRegion.java:1323)
>     at org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1316)
>     at org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1303)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:5194)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:5025)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:1143)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:2232)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29754)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> I added some custom logging to the snapshot logic to be able to see the snapshot sizes:
> {code:java}
> 2019-12-17 08:31:56,900 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name: "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 41392 store_flushes { family_name: "f1" store_home_dir: "f1" } store_flushes { family_name: "f2" store_home_dir: "f2" } store_flushes { family_name: "f3" store_home_dir: "f3" } region_name: "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:56,900 INFO org.apache.hadoop.hbase.regionserver.HRegion: Flushing 0beaae111b0f6e98bfde31ba35be5408 3/3 column families, dataSize=126.49 MB heapSize=138.24 MB
> 2019-12-17 08:31:56,900 WARN org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again without clearing previous. Doing nothing. Another ongoing flush or did we fail last attempt?
> 2019-12-17 08:31:56,900 DEBUG org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME => 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.', STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore: dataSize=41660529, getHeapSize=45422304, getOffHeapSize=0, getCellsCount=49951
> 2019-12-17 08:31:56,901 DEBUG org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME => 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.', STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore: dataSize=72246315, getHeapSize=78890240, getOffHeapSize=0, getCellsCount=88222
> 2019-12-17 08:31:56,901 WARN org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again without clearing previous. Doing nothing. Another ongoing flush or did we fail last attempt?
> 2019-12-17 08:31:56,901 DEBUG org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME => 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.', STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore: dataSize=42016800, getHeapSize=45799776, getOffHeapSize=0, getCellsCount=50226
> 2019-12-17 08:31:56,901 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Prepared flush with seqId:41392
> {code}
> This shows that the snapshots' combined size is actually bigger than the region's memStoreSizing. So when do we decrease memStoreSizing without actually flushing?
> A bit earlier in the log I found this:
> {code:java}
> 2019-12-17 08:31:43,581 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name: "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 40445 store_flushes { family_name: "f2" store_home_dir: "f2" } region_name: "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:43,582 WARN org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Received a flush prepare marker with a larger seqId: 40445 before clearing the previous one with seqId: 20747. Ignoring
> 2019-12-17 08:31:43,582 TRACE org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Store files for region:
> 2019-12-17 08:31:43,582 INFO org.apache.hadoop.hbase.regionserver.HRegion: writing data to region IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408. with WAL disabled. Data may be lost in the event of a crash.
> 2019-12-17 08:31:43,586 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name: "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 40445 store_flushes { family_name: "f2" store_home_dir: "f2" flush_output: "3f6a5302b36a4a65a6caa291a03f108d" } region_name: "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:43,586 WARN org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with larger seqId: 40445 than what we have prepared with seqId: 20747. Picking up new file and dropping prepared memstore snapshot
> 2019-12-17 08:31:43,587 WARN org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Unexpected: flush commit marker received from store f2 but no associated flush context. Ignoring
> 2019-12-17 08:31:43,587 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: CustomLog decrMemStoreSize. Current: dataSize=184492444, getHeapSize=201769960, getOffHeapSize=0, getCellsCount=229450 delta: dataSizeDelta=83677329, heapSizeDelta=91222080, offHeapSizeDelta=0, cellsCountDelta=100177
> 2019-12-17 08:31:43,587 INFO org.apache.hadoop.hbase.regionserver.HRegion: 0beaae111b0f6e98bfde31ba35be5408 : Not dropping memstore contents since replayed flush seqId: 40445 is smaller than current seqId:40481
> {code}
> Based on this, the issue seems to originate from HRegion.replayFlushInStores: when the StoreFlushContext is missing, we log a warning and continue with the next StoreFlushDescriptor without actually flushing the snapshot in question. However, replayWALFlushCommitMarker is unaware of the warning and carries on as if the flush had succeeded, decreasing the memStoreSizing (and doing cleanup).
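The failure mode can be condensed into a minimal, self-contained sketch (plain Java, not the actual HBase code; the numbers are taken from the logs above). It assumes, as the log suggests, that the negativity check only reports the problem while the region-level counter is still updated with the full delta, including the snapshot of the store that was silently skipped:

```java
// Hypothetical sketch of the region-level accounting that goes wrong.
public class NegativeSizingSketch {
    // Region-level counter, seeded with the value from the ERROR log line.
    static long regionDataSize = 135_810_071L;

    // Models the decrement path: the check only logs, so the value
    // still goes negative when the delta is too large.
    static void decrMemStoreSize(long delta) {
        long next = regionDataSize - delta;
        if (next < 0) {
            System.out.println("ERROR: memStoreSizing would become negative: " + next);
        }
        regionDataSize = next;
    }

    public static void main(String[] args) {
        // dataSizeDelta from the log: the combined snapshot sizes,
        // including a store whose flush replay was skipped earlier.
        decrMemStoreSize(155_923_644L);
        System.out.println("regionDataSize=" + regionDataSize);
    }
}
```

With the logged values, the counter lands at 135810071 - 155923644 = -20113573, matching the reported symptom.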
>
> We should consider decreasing the memStoreSizing at the per-store level instead of the region level, and performing the negativity check before applying the decrement, potentially preventing a flush if needed, to make sure memStoreSizing stays positive.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
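A minimal sketch of the per-store proposal (the names storeDataSize and decrStoreSize are hypothetical, not the actual HBase API): track sizing per store and refuse a decrement that would drive that store's size negative, so the region-level sum can never go below zero. The snapshot sizes and the stale delta are taken from the logs above:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: per-store accounting with a pre-decrement check.
public class PerStoreSizingSketch {
    static final Map<String, Long> storeDataSize = new HashMap<>();

    // Decrement one store's size; reject a decrement that would go
    // negative (e.g. a commit marker whose flush was never prepared).
    static boolean decrStoreSize(String family, long delta) {
        long current = storeDataSize.getOrDefault(family, 0L);
        if (delta > current) {
            return false;  // skip the bogus decrement instead of going negative
        }
        storeDataSize.put(family, current - delta);
        return true;
    }

    // The region-level memStoreSizing is just the sum over stores.
    static long regionDataSize() {
        return storeDataSize.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // Per-store snapshot sizes from the CustomLog lines above.
        storeDataSize.put("f1", 41_660_529L);
        storeDataSize.put("f2", 72_246_315L);
        storeDataSize.put("f3", 42_016_800L);
        // The stale seqId 40445 commit tried to subtract 83677329 for f2,
        // more than f2 ever held; per-store accounting rejects it.
        boolean applied = decrStoreSize("f2", 83_677_329L);
        System.out.println("applied=" + applied + " regionDataSize=" + regionDataSize());
    }
}
```

Because the check runs against the individual store's balance, a commit marker for a store with no prepared snapshot is rejected instead of being absorbed into (and corrupting) the region-wide counter.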