[ https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vasu Mariyala updated HBASE-7709:
---------------------------------

    Attachment: 0.95-trunk-rev3.patch
                HBASE-7709-rev4.patch

Thanks [~yuzhih...@gmail.com] for the review. Yes, I feel clusterIds may be better than just clusters. Attached the patches HBASE-7709-rev4.patch (0.94) and 0.95-trunk-rev3.patch (0.95 & trunk), which contain the method name changes.

> Infinite loop possible in Master/Master replication
> ---------------------------------------------------
>
>                 Key: HBASE-7709
>                 URL: https://issues.apache.org/jira/browse/HBASE-7709
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.6, 0.95.1
>            Reporter: Lars Hofhansl
>            Assignee: Vasu Mariyala
>             Fix For: 0.98.0, 0.94.12, 0.96.0
>
>         Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch,
> 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, HBASE-7709.patch,
> HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, HBASE-7709-rev3.patch,
> HBASE-7709-rev4.patch
>
>
> We just discovered the following scenario:
> # Cluster A and B are set up in master/master replication.
> # By accident we had Cluster C replicate to Cluster A.
> Now all edits originating from C will bounce between A and B. Forever!
> The reason is that when the edit comes in from C the cluster ID is already set
> and won't be reset.
> We have a couple of options here:
> # Optionally only support master/master (not cycles of more than two
> clusters). In that case we can always reset the cluster ID in the
> ReplicationSource. That means that now cycles > 2 will have the data cycle
> forever. This is the only option that requires no changes in the HLog format.
> # Instead of a single cluster ID per edit, maintain an (unordered) set of
> cluster IDs that have seen this edit. Then in ReplicationSource we drop any
> edit that the sink has seen already. This is the cleanest approach, but it
> might need a lot of data stored per edit if there are many clusters involved.
> # Maintain a configurable counter of the maximum cycle size we want to
> support. Could default to 10 (or maybe even fewer). Store a hop-count in the
> WAL and have the ReplicationSource increase that hop-count on each hop. If
> we're over the max, just drop the edit.
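
As a rough illustration of the second and third options above (the set of seen cluster IDs and the hop-count), here is a minimal, self-contained Java sketch. It is not the code from the attached patches and it uses no HBase classes: ReplicationLoopSketch, Edit, accidentalTopology and simulate are hypothetical names made up for this example, and the hop-count rule used here (stop shipping an edit once it has made maxHops hops) is only one assumed reading of "over the max". The sketch replays the A <-> B plus C -> A scenario and counts how many times an edit originating in C gets shipped.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are HBase APIs.
final class ReplicationLoopSketch {

    /** One copy of an edit as held by one cluster. */
    static final class Edit {
        final String location;        // cluster currently holding this copy
        final Set<String> clusterIds; // set-based option: clusters that have seen the edit
        final int hopCount;           // hop-count option: hops made so far

        Edit(String location, Set<String> clusterIds, int hopCount) {
            this.location = location;
            this.clusterIds = clusterIds;
            this.hopCount = hopCount;
        }
    }

    /** The scenario from the report: A <-> B, plus C -> A by accident. */
    static Map<String, List<String>> accidentalTopology() {
        Map<String, List<String>> peers = new HashMap<>();
        peers.put("A", Arrays.asList("B"));
        peers.put("B", Arrays.asList("A"));
        peers.put("C", Arrays.asList("A"));
        return peers;
    }

    /**
     * Counts how often an edit written at 'origin' is shipped to a peer.
     * useSeenSet = true : drop the edit if the sink is already in clusterIds.
     * useSeenSet = false: drop the edit once it has made maxHops hops.
     */
    static int simulate(Map<String, List<String>> peers, String origin,
                        boolean useSeenSet, int maxHops) {
        Deque<Edit> pending = new ArrayDeque<>();
        pending.add(new Edit(origin, Collections.singleton(origin), 0));
        int deliveries = 0;
        while (!pending.isEmpty()) {
            Edit e = pending.poll();
            for (String sink : peers.getOrDefault(e.location, Collections.emptyList())) {
                boolean ship = useSeenSet
                        ? !e.clusterIds.contains(sink)
                        : e.hopCount < maxHops;
                if (!ship) {
                    continue; // edit is dropped at the source side
                }
                Set<String> seen = new HashSet<>(e.clusterIds);
                seen.add(sink); // the sink has now seen this edit
                pending.add(new Edit(sink, seen, e.hopCount + 1));
                deliveries++;
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        Map<String, List<String>> peers = accidentalTopology();
        System.out.println("seen-set deliveries:  " + simulate(peers, "C", true, 10));
        System.out.println("hop-count deliveries: " + simulate(peers, "C", false, 10));
    }
}
```

In this toy model the set-based check stops the edit after two shipments (C to A, then A to B), while the hop-count check lets it bounce until the limit is reached, ten shipments with a limit of 10. That matches the trade-off described above: the set costs more data per edit but cuts the loop immediately, and the counter is cheaper but only bounds the loop.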