[ https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413298#comment-15413298 ]

Hudson commented on HBASE-9465:
-------------------------------

FAILURE: Integrated in HBase-Trunk_matrix #1383 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1383/])
HBASE-9465 Push entries to peer clusters serially (zhangduo: rev 5cadcd59aa57c9566349dc8551c958dc974e774e)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/ReplicationMetaCleaner.java
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* hbase-protocol/src/main/protobuf/WAL.proto
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/TestMetaTableAccessor.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestSerialReplication.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> Push entries to peer clusters serially
> --------------------------------------
>
>                 Key: HBASE-9465
>                 URL: https://issues.apache.org/jira/browse/HBASE-9465
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Replication
>   Affects Versions: 2.0.0, 1.4.0
>            Reporter: Honghua Feng
>            Assignee: Phil Yang
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-9465-branch-1-v1.patch,
> HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch,
> HBASE-9465-branch-1-v3.patch, HBASE-9465-branch-1-v4.patch,
> HBASE-9465-branch-1-v4.patch, HBASE-9465-v1.patch, HBASE-9465-v2.patch,
> HBASE-9465-v2.patch, HBASE-9465-v3.patch, HBASE-9465-v4.patch,
> HBASE-9465-v5.patch, HBASE-9465-v6.patch, HBASE-9465-v6.patch,
> HBASE-9465-v7.patch, HBASE-9465-v7.patch, HBASE-9465.pdf
>
>
> When a region move or RS failure occurs in the master cluster, the hlog
> entries that were not pushed before the region move or RS failure will be
> pushed by the original RS (for a region move) or by another RS that takes
> over the remaining hlog of the dead RS (for an RS failure). The new entries
> for the same region(s) will be pushed by the RS that now serves the
> region(s). These RSs push the hlog entries of the same region concurrently,
> without coordination.
> This treatment can lead to data inconsistency between the master and peer
> clusters:
> 1. A put and then a delete are written to the master cluster.
> 2. Due to a region move or RS failure, they are pushed by different
> replication-source threads to the peer cluster.
> 3. If the delete is pushed to the peer cluster before the put, and a flush
> and major compaction occur in the peer cluster before the put is pushed,
> the delete is collected and the put remains in the peer cluster.
> In this scenario, the put remains in the peer cluster, but in the master
> cluster the put is masked by the delete, hence the data inconsistency
> between the master and peer clusters.
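
The put/delete scenario in the issue description can be reproduced with a toy model. The sketch below is not HBase code: `Edit`, `ToyPeerStore`, and the two helper methods are hypothetical names chosen for illustration, and major compaction is simplified to "drop every delete marker together with every put it masks". It only aims to show why the arrival order matters once a compaction runs between the two pushes.

```java
import java.util.ArrayList;
import java.util.List;

class SerialReplicationSketch {

    /** One replicated edit for a row; ts is the timestamp assigned on the master cluster. */
    static final class Edit {
        final String row;
        final long ts;
        final boolean isDelete;

        Edit(String row, long ts, boolean isDelete) {
            this.row = row;
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    /** Toy stand-in for a peer-cluster store (an assumption, not HBase's storage engine). */
    static final class ToyPeerStore {
        private final List<Edit> cells = new ArrayList<>();

        void apply(Edit e) {
            cells.add(e);
        }

        /** A put is masked while a delete marker for the same row with ts >= put.ts exists. */
        private boolean isMasked(Edit put) {
            for (Edit c : cells) {
                if (c.isDelete && c.row.equals(put.row) && c.ts >= put.ts) {
                    return true;
                }
            }
            return false;
        }

        /** Simplified major compaction: discard delete markers and the puts they mask. */
        void majorCompact() {
            List<Edit> kept = new ArrayList<>();
            for (Edit c : cells) {
                if (!c.isDelete && !isMasked(c)) {
                    kept.add(c);
                }
            }
            cells.clear();
            cells.addAll(kept);
        }

        boolean isVisible(String row) {
            for (Edit c : cells) {
                if (!c.isDelete && c.row.equals(row) && !isMasked(c)) {
                    return true;
                }
            }
            return false;
        }
    }

    // Step 1 of the description: a put, then a delete, written on the master cluster.
    static final Edit PUT = new Edit("row1", 1L, false);
    static final Edit DELETE = new Edit("row1", 2L, true);

    /** Entries reach the peer in master order; the delete masks the put. */
    static boolean visibleAfterSerialPush() {
        ToyPeerStore peer = new ToyPeerStore();
        peer.apply(PUT);
        peer.apply(DELETE);
        peer.majorCompact();
        return peer.isVisible("row1");
    }

    /** Steps 2 and 3: the delete arrives first and is collected before the put arrives. */
    static boolean visibleAfterReorderedPush() {
        ToyPeerStore peer = new ToyPeerStore();
        peer.apply(DELETE);
        peer.majorCompact();
        peer.apply(PUT);
        return peer.isVisible("row1");
    }

    public static void main(String[] args) {
        System.out.println("serial push: row visible on peer = " + visibleAfterSerialPush());
        System.out.println("reordered push: row visible on peer = " + visibleAfterReorderedPush());
    }
}
```

In this model the row stays hidden on the peer when the entries arrive in master order, and becomes visible when the delete is compacted away before the put arrives, which is the inconsistency the issue describes.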