[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

Vasu Mariyala (JIRA) Wed, 21 Aug 2013 13:02:50 -0700

    [ https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746781#comment-13746781 ]


Vasu Mariyala commented on HBASE-7709:
--------------------------------------

Attached patches for 0.94 (HBASE-7709-rev3.patch) and for 0.95/trunk 
(0.95-trunk-rev2.patch) that address the nits mentioned by Lars.

0.94

   a) Changed PREFIX_CLUSTER_KEY to '.' (a period, since column family names 
can't start with one)

   b) PREFIX_CONSUMED_CLUSTER_IDS changed to "_cs.id"

   c) Added a comment in WALEdit noting that this is done for backwards 
compatibility and has been removed in the 0.95.2+ releases

trunk/0.95
 
  a) From protobuf documentation 

     "repeated: this field can be repeated any number of times (including zero) 
in a well-formed message. The order of the repeated values will be preserved.".
     "optional: a well-formed message can have zero or one of this field (but 
not more than one)."

      So doesn't repeated already imply optional? Also, in WALProtos.java the 
clusters list is initialized to an empty list in the initFields() method, so we 
would not get a NullPointerException. I may need to do more research on this.

  b) clusters in Import has been changed to use a singleton

  c) addClusters has an overload, public Builder 
addClusters(org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.UUID value), 
which takes the UUID as its parameter.

  d) Yes, this is used only to read the older log entries when migrating from 
0.94 to 0.95.2.
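
On point a) for trunk/0.95, a minimal sketch of why a repeated field behaves 
like an optional one from the caller's side: a repeated field may legally have 
zero entries, and it then reads back as an empty list rather than null. The 
class below is a hand-written stand-in and not the generated WALProtos code. 
The name Key, the method bodies, and the use of java.util.UUID in place of 
HBaseProtos.UUID are assumptions that only mimic the initFields() pattern 
described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.UUID;

public class RepeatedFieldSketch {

    // Hypothetical stand-in for a message with "repeated UUID clusters".
    // It is NOT the generated WALProtos class; it only mimics the pattern.
    static final class Key {
        private List<UUID> clusters;

        Key() {
            initFields();
        }

        // A repeated field starts out as an empty list, never as null.
        private void initFields() {
            clusters = Collections.emptyList();
        }

        // Appends one cluster id; insertion order is preserved.
        Key addClusters(UUID value) {
            List<UUID> copy = new ArrayList<>(clusters);
            copy.add(Objects.requireNonNull(value));
            clusters = Collections.unmodifiableList(copy);
            return this;
        }

        List<UUID> getClustersList() {
            return clusters;
        }

        int getClustersCount() {
            return clusters.size();
        }
    }

    public static void main(String[] args) {
        // An entry written without any cluster ids (e.g. an older log entry).
        Key oldEntry = new Key();
        System.out.println(oldEntry.getClustersCount());

        // Iterating is safe without a null check: the loop body never runs.
        for (UUID id : oldEntry.getClustersList()) {
            System.out.println(id);
        }

        Key newEntry = new Key().addClusters(UUID.randomUUID());
        System.out.println(newEntry.getClustersCount());
    }
}
```

So a reader of such an entry can iterate over the clusters list directly and 
treat "no clusters" and "field absent" the same way.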

                
> Infinite loop possible in Master/Master replication
> ---------------------------------------------------
>
>                 Key: HBASE-7709
>                 URL: https://issues.apache.org/jira/browse/HBASE-7709
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.6, 0.95.1
>            Reporter: Lars Hofhansl
>            Assignee: Vasu Mariyala
>             Fix For: 0.98.0, 0.94.12, 0.96.0
>
>         Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
> 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
> HBASE-7709-rev2.patch, HBASE-7709-rev3.patch
>
>
>  We just discovered the following scenario:
> # Cluster A and B are set up in master/master replication
> # By accident we had Cluster C replicate to Cluster A.
> Now all edits originating from C will be bouncing between A and B. Forever!
> The reason is that when the edit comes in from C the cluster ID is already 
> set and won't be reset.
> We have a couple of options here:
> # Optionally only support master/master (not cycles of more than two 
> clusters). In that case we can always reset the cluster ID in the 
> ReplicationSource. That means that now cycles > 2 will have the data cycle 
> forever. This is the only option that requires no changes in the HLog format.
> # Instead of a single cluster id per edit maintain an (unordered) set of 
> cluster ids that have seen this edit. Then in ReplicationSource we drop any 
> edit that the sink has seen already. This is the cleanest approach, but it 
> might need a lot of data stored per edit if there are many clusters involved.
> # Maintain a configurable counter of the maximum cycle size we want to 
> support. Could default to 10 (even maybe even just). Store a hop-count in the 
> WAL and the ReplicationSource increases that hop-count on each hop. If we're 
> over the max, just drop the edit.
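
The scenario and the second option in the description above can be sketched as 
a small simulation. This is only an illustration under stated assumptions: the 
class, the Edit type, countShipments() and demoPeers() are hypothetical names, 
clusters are plain strings, and none of this is the ReplicationSource code. 
With a single origin id per edit, an edit from C keeps bouncing between A and 
B; with a set of ids that have seen the edit, it is dropped after two hops.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReplicationLoopSketch {

    // An edit in flight: the cluster it sits on and the ids recorded on it.
    static final class Edit {
        final String at;
        final Set<String> seenBy;

        Edit(String at, Set<String> seenBy) {
            this.at = at;
            this.seenBy = seenBy;
        }
    }

    // Topology from the report: A <-> B master/master, plus C -> A by accident.
    static Map<String, List<String>> demoPeers() {
        return Map.of("A", List.of("B"), "B", List.of("A"), "C", List.of("A"));
    }

    // Counts how often the edit is shipped before every source drops it.
    // Returns -1 once more than 'limit' shipments happen (treated as a loop).
    // trackAllClusters=false keeps only the origin id on the edit;
    // trackAllClusters=true adds every cluster that receives the edit.
    static int countShipments(Map<String, List<String>> peers, String origin,
                              boolean trackAllClusters, int limit) {
        Deque<Edit> queue = new ArrayDeque<>();
        queue.add(new Edit(origin, Set.of(origin)));
        int shipped = 0;
        while (!queue.isEmpty()) {
            Edit edit = queue.poll();
            for (String sink : peers.getOrDefault(edit.at, List.of())) {
                if (edit.seenBy.contains(sink)) {
                    continue; // the sink is already recorded on the edit: drop
                }
                if (++shipped > limit) {
                    return -1;
                }
                Set<String> seen = new HashSet<>(edit.seenBy);
                if (trackAllClusters) {
                    seen.add(sink);
                }
                queue.add(new Edit(sink, seen));
            }
        }
        return shipped;
    }

    public static void main(String[] args) {
        // Single origin id: C -> A -> B -> A -> B ... never terminates.
        System.out.println(countShipments(demoPeers(), "C", false, 1000)); // -1
        // Set of ids: C -> A -> B, then B drops it because A has seen it.
        System.out.println(countShipments(demoPeers(), "C", true, 1000));  // 2
    }
}
```

The cost noted in the description is visible here as well: the seenBy set 
grows by one id per cluster the edit passes through.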

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
