[ 
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138637#comment-13138637
 ] 

Chris Trezzo commented on HBASE-2611:
-------------------------------------

I think adding the ability to atomically move a znode and all of its child 
znodes might be a fairly invasive change. I couldn't find an existing utility 
package for this, but there is a ZooKeeper patch 
([ZOOKEEPER-965|https://issues.apache.org/jira/browse/ZOOKEEPER-965]) 
implementing atomic batch operations that is scheduled for 3.4.

I thought about the problem a little bit, and after conferring with Lars, I 
think we might not need the atomic move (although it would definitely make it 
simpler).

Below is some pseudocode for the algorithm I came up with. It is very similar 
to what you suggested above. Both intentions and locks are tagged with a 
region server: a lock is tagged with the rs that holds it, and an intention is 
tagged with the rs it intends to lock. Intentions sit at the same level in the 
znode structure as locks. The algorithm is recursive and depth-first.
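
To make the tagging concrete, here is a minimal sketch of how the two kinds of znodes might be named. It is one reading of the description, not a confirmed layout: the base path, the "lock-" and "intention-" prefixes, and both helper functions are hypothetical, and it assumes that an intention is stored under the rs that recorded it so that a later failover of that rs can find it.

```python
# Hypothetical layout: the base path and the "lock-"/"intention-" prefixes
# are assumptions for illustration, not names taken from HBase.
BASE = "/hbase/replication/rs"


def lock_znode(locked_rs: str, holder_rs: str) -> str:
    """Lock on locked_rs, tagged with the rs that holds it."""
    return f"{BASE}/{locked_rs}/lock-{holder_rs}"


def intention_znode(intending_rs: str, target_rs: str) -> str:
    """Intention recorded by intending_rs, tagged with the rs it wants to lock."""
    return f"{BASE}/{intending_rs}/intention-{target_rs}"


# rs2 announces that it wants to fail over rs1, then takes the lock on rs1.
print(intention_znode("rs2", "rs1"))
print(lock_znode("rs1", "rs2"))
```

With this naming, the lock sits under the failed rs and the intention sits under the rs doing the failover, both as direct children of a region server znode.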

Questions/comments/suggestions always appreciated.

Chris

{code}

//this method is the top-level failover method (i.e. NodeFailoverWorker.run())
failOverRun(FailedNode a) {
  recordIntention(a, this);
  if(getLock(a, this)) {
    //transfer all queues to local node
    moveState(a, this, this);
  }
  else {
    deleteIntention(a, this);
    return;
  }
  replicateQueues();
}

moveState(NodeToMove a, CurrentNode c, TargetNode t) {
  if(lock exists on a) {
    if(lock on a is owned by c) {
      moveStateHelper(a, c, t);
    }
    else {
      //someone else has the lock and is handling
      //the failover
      deleteIntention(a, c);
    }
  }
  else {
    if(queue znodes exist) {
      //we know that this node has queues to transfer
      if(getLock(a, c)) {
        moveStateHelper(a, c, t);
      }
      else {
        deleteIntention(a, c);
      }
    }
    else {
      //we know that this node is being deleted
      deleteState(a);
      deleteIntention(a, c);
    }
  }
}

moveStateHelper(NodeToMove a, CurrentNode c, TargetNode t) {
  for(every intention b of a) {
    moveState(b, a, t);
  }
  //we need to safely handle the case where we try to copy
  //queues that have already been copied
  copy all queues in a to t;
  deleteState(a);
  deleteIntention(a, c);
}

deleteState(NodeToDelete d) {
  //there is no need to traverse down the tree at all
  //because at this point everything below us should have
  //been deleted
  //
  //we need to safely handle the case where we attempt to delete
  //nodes that have already been deleted

  delete entire node;
}

{code}
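
The pseudocode above can be exercised with a small in-memory simulation. This is only a sketch under stated assumptions: the `Tree` class is a hypothetical stand-in for the znode tree (plain dicts, no ZooKeeper client, no concurrency), queue names are assumed to be unique across region servers, and `replicateQueues()` is left out. The function names mirror the pseudocode.

```python
class Tree:
    """In-memory stand-in for the znode tree; not a ZooKeeper client."""

    def __init__(self, queues):
        self.queues = {rs: set(qs) for rs, qs in queues.items()}  # rs -> queues
        self.locks = {}       # locked rs -> rs holding the lock
        self.intentions = {}  # rs -> set of rs it intends to lock


def record_intention(tree, a, c):
    tree.intentions.setdefault(c, set()).add(a)


def delete_intention(tree, a, c):
    tree.intentions.get(c, set()).discard(a)


def get_lock(tree, a, c):
    # Fails if a's node is already gone or another rs holds the lock.
    if a not in tree.queues:
        return False
    return tree.locks.setdefault(a, c) == c


def delete_state(tree, d):
    # Deleting something that is already deleted is a no-op.
    tree.queues.pop(d, None)
    tree.locks.pop(d, None)
    tree.intentions.pop(d, None)


def move_state_helper(tree, a, c, t):
    # Depth first: finish any failovers that a had started.
    for b in list(tree.intentions.get(a, ())):
        move_state(tree, b, a, t)
    # Set union makes copying an already-copied queue harmless.
    tree.queues.setdefault(t, set()).update(tree.queues.get(a, ()))
    delete_state(tree, a)
    delete_intention(tree, a, c)


def move_state(tree, a, c, t):
    holder = tree.locks.get(a)
    if holder is not None:
        if holder == c:
            move_state_helper(tree, a, c, t)
        else:
            delete_intention(tree, a, c)  # someone else handles a
    elif tree.queues.get(a):
        if get_lock(tree, a, c):
            move_state_helper(tree, a, c, t)
        else:
            delete_intention(tree, a, c)
    else:
        delete_state(tree, a)  # a was in the middle of being deleted
        delete_intention(tree, a, c)


def fail_over_run(tree, a, me):
    record_intention(tree, a, me)
    if not get_lock(tree, a, me):
        delete_intention(tree, a, me)
        return False
    move_state(tree, a, me, me)
    return True  # replicateQueues() would follow here


# rsA fails; rsB records its intention and locks rsA, then fails itself
# before moving anything. rsC then fails over rsB.
tree = Tree({"rsA": {"qA"}, "rsB": {"qB"}, "rsC": {"qC"}})
record_intention(tree, "rsA", "rsB")
get_lock(tree, "rsA", "rsB")
fail_over_run(tree, "rsB", "rsC")
print(sorted(tree.queues["rsC"]))
```

In this scenario rsC follows rsB's intention down to rsA, so it ends up with the queues of both failed servers. The simulation sidesteps the two "safely handle" comments by using set union and `pop` with a default; against real ZooKeeper the equivalent would presumably be tolerating `KeeperException.NodeExistsException` on copy and `KeeperException.NoNodeException` on delete.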
                
> Handle RS that fails while processing the failure of another one
> ----------------------------------------------------------------
>
>                 Key: HBASE-2611
>                 URL: https://issues.apache.org/jira/browse/HBASE-2611
>             Project: HBase
>          Issue Type: Sub-task
>          Components: replication
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer 
> of HLogs queues from other region servers that failed. Devise a reliable way 
> to do it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
