[ https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753140#comment-13753140 ]
Hudson commented on HBASE-7709:
-------------------------------

SUCCESS: Integrated in HBase-0.94-security #274 (See [https://builds.apache.org/job/HBase-0.94-security/274/])
HBASE-7709 Infinite loop possible in Master/Master replication (Vasu Mariyala) (larsh: rev 1518410)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java

> Infinite loop possible in Master/Master replication
> ---------------------------------------------------
>
>                 Key: HBASE-7709
>                 URL: https://issues.apache.org/jira/browse/HBASE-7709
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.6, 0.95.1
>            Reporter: Lars Hofhansl
>            Assignee: Vasu Mariyala
>             Fix For: 0.98.0, 0.94.12, 0.96.0
>
>         Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch,
> 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, 0.95-trunk-rev4.patch,
> 7709-0.94-rev6.txt, HBASE-7709.patch, HBASE-7709-rev1.patch,
> HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, HBASE-7709-rev4.patch,
> HBASE-7709-rev5.patch
>
>
> We just discovered the following scenario:
> # Clusters A and B are set up in master/master replication.
> # By accident we had Cluster C replicate to Cluster A.
> Now all edits originating from C will be bouncing between A and B. Forever!
> The reason is that when the edits come in from C the cluster ID is already set
> and won't be reset.
> We have a couple of options here:
> # Optionally only support master/master (not cycles of more than two
> clusters). In that case we can always reset the cluster ID in the
> ReplicationSource. That means that cycles > 2 will have the data cycle
> forever. This is the only option that requires no changes in the HLog format.
> # Instead of a single cluster ID per edit, maintain an (unordered) set of
> cluster IDs that have seen this edit. Then in ReplicationSource we drop any
> edit that the sink has already seen. This is the cleanest approach, but it
> might need a lot of data stored per edit if there are many clusters involved.
> # Maintain a configurable counter of the maximum cycle size we want to
> support. It could default to 10 (or maybe even lower). Store a hop count in
> the WAL, and have the ReplicationSource increase that hop count on each hop.
> If we're over the max, just drop the edit.
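
For illustration only, options 2 and 3 above could be sketched roughly as follows. This is not the committed patch and does not use the real HBase classes: SketchEdit, SketchReplicationSource, and all of their methods are hypothetical names invented for this example, and the edit is modeled as a plain in-memory object rather than an HLog entry.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical stand-in for a WAL edit; NOT the real HBase WALEdit class. */
final class SketchEdit {
    // Option 2: every cluster that has already seen this edit.
    private final Set<UUID> seenClusterIds = new HashSet<>();
    // Option 3: number of replication hops taken so far.
    private int hopCount = 0;

    void markSeenBy(UUID clusterId) { seenClusterIds.add(clusterId); }
    boolean wasSeenBy(UUID clusterId) { return seenClusterIds.contains(clusterId); }
    int hopCount() { return hopCount; }
    void incrementHopCount() { hopCount++; }
}

/** Hypothetical stand-in for the shipping side of replication on one cluster. */
final class SketchReplicationSource {
    private final UUID localClusterId;
    private final int maxHops; // assumed to be a configurable limit

    SketchReplicationSource(UUID localClusterId, int maxHops) {
        this.localClusterId = localClusterId;
        this.maxHops = maxHops;
    }

    /** Option 2: record the local cluster, then ship only if the peer has not seen the edit. */
    boolean shouldShipBySeenSet(SketchEdit edit, UUID peerClusterId) {
        edit.markSeenBy(localClusterId);
        return !edit.wasSeenBy(peerClusterId);
    }

    /** Option 3: ship only while the hop count is below the limit, counting each hop. */
    boolean shouldShipByHopCount(SketchEdit edit) {
        if (edit.hopCount() >= maxHops) {
            return false; // limit reached: drop the edit
        }
        edit.incrementHopCount();
        return true;
    }
}
```

In the scenario from the description (C replicates to A, and A and B replicate to each other), the seen-set check lets an edit travel from C to A and from A to B, and then drops it when B tries to ship it back to A, because A is already in the set. The hop-count check instead bounds how far any edit can travel, at the cost of allowing some redundant hops before the limit is hit.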