[ 
https://issues.apache.org/jira/browse/HBASE-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015089#comment-13015089
 ] 

Subbu M Iyer commented on HBASE-3210:
-------------------------------------

First draft of my patch for review.

Here is what is being done now:

1. When the primary master's abort is triggered by the ZK node listener on a 
ZK session expiry event, we first try to restore the ZK session. If the 
session is restored successfully, we ignore the abort trigger and continue 
working as the primary master.

2. A successful ZK session recovery involves the following:
   a. Create a ZK session.
   b. Try to become the primary master again (so that we don't step on the 
      secondary master's toes).
   c. Initialize all ZK-based trackers. This includes the AssignmentManager, 
      CatalogTracker, RegionServerTracker and ClusterStatusTracker.
   d. Assign Root and Meta. (We just ensure that our local in-memory 
      structures are correctly updated to reflect our earlier Root/Meta 
      assignments.)
   e. Process any regions in transition (RIT) that came in during our 
      blackout.

3. Refactored the master startup logic so that we can reuse it during a 
master session recovery attempt.
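
For reviewers who want the control flow at a glance, here is a minimal sketch of the recovery sequence described in points 1 and 2. It is not code from the patch: the class, the RecoverySteps interface and every method name are hypothetical stand-ins for the real master internals, and treating any exception as a failed recovery is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: all names here are hypothetical, not taken from the patch. */
final class MasterSessionRecoverySketch {

    /** Hypothetical stand-in for the ZK-dependent work the master performs. */
    interface RecoverySteps {
        void createZkSession() throws Exception;
        boolean becomeActiveMaster() throws Exception;
        void initializeZkTrackers() throws Exception;
        void assignRootAndMeta() throws Exception;
        void processRegionsInTransition() throws Exception;
    }

    /**
     * Runs steps a-e in order. Returns true if the session was restored,
     * in which case the caller may ignore the abort trigger.
     */
    static boolean tryRecoverSession(RecoverySteps steps) {
        try {
            steps.createZkSession();               // a. new ZK session
            if (!steps.becomeActiveMaster()) {     // b. re-run the election
                return false;                      // another master took over
            }
            steps.initializeZkTrackers();          // c. trackers
            steps.assignRootAndMeta();             // d. refresh Root/Meta state
            steps.processRegionsInTransition();    // e. RIT from the blackout
            return true;
        } catch (Exception e) {
            // Assumption: any failure means recovery failed and the abort proceeds.
            return false;
        }
    }

    /** Test double that records which steps ran, in order. */
    static final class RecordingSteps implements RecoverySteps {
        final List<String> calls = new ArrayList<>();
        private final boolean winElection;

        RecordingSteps(boolean winElection) {
            this.winElection = winElection;
        }

        public void createZkSession() { calls.add("createZkSession"); }
        public boolean becomeActiveMaster() {
            calls.add("becomeActiveMaster");
            return winElection;
        }
        public void initializeZkTrackers() { calls.add("initializeZkTrackers"); }
        public void assignRootAndMeta() { calls.add("assignRootAndMeta"); }
        public void processRegionsInTransition() {
            calls.add("processRegionsInTransition");
        }
    }

    public static void main(String[] args) {
        RecordingSteps steps = new RecordingSteps(true);
        boolean recovered = tryRecoverSession(steps);
        System.out.println("recovered=" + recovered + " steps=" + steps.calls);
    }
}
```

The point of the sketch is the ordering: the election in step b runs before any tracker or assignment work, so a master that loses the election stops without touching cluster state.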


  


> HBASE-1921 for the new master
> -----------------------------
>
>                 Key: HBASE-3210
>                 URL: https://issues.apache.org/jira/browse/HBASE-3210
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 
> HBASE-3210-When_the_Master_s_session_times_out_and_there_s_only_one,_cluster_is_wedged.patch
>
>
> HBASE-1921 was lost when writing the new master code. I guess it's going to 
> be much harder to implement now, but I think it's a critical feature to have 
> considering the reasons that brought me to do it in the old master. There's 
> already a test in TestZooKeeper which was disabled a while ago.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
