Jonathan Hsieh created HBASE-8735:
-------------------------------------

             Summary: "backup" hbase masters should eventually become active if 
the "primary" never comes up.
                 Key: HBASE-8735
                 URL: https://issues.apache.org/jira/browse/HBASE-8735
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.95.1
            Reporter: Jonathan Hsieh


I was taking a look at my rig running hbase 0.95.1rc1 and it got to a point 
where a "backup" master process was stuck waiting for a primary master that was 
never going to show up. One master was down, and the other master was up but 
thought it was only supposed to be a backup, so it was stuck in this loop [1].

There was no master znode in zk, and the designated master was never going to 
succeed at starting (due to other reasons).

I was killing the active master and then using the start-all.sh script to 
start any killed process back up again -- my guess is that the backup-masters 
special case may be getting the backup stuck in that loop -- and that since the 
"primary" never shows up it stays stuck there.

In this case, I think the "backup" master should eventually take over instead 
of being stuck in the loop and having an hbase with no master.
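
To make the suggestion above concrete, here is a minimal sketch of a bounded 
wait. It is not the HMaster code: the class, method, enum, and parameter names 
are all hypothetical, and the check for an active master is abstracted as a 
BooleanSupplier rather than a real ZooKeeper lookup.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch only; none of these names come from HBase.
public class BackupMasterStall {

    /** Result of the bounded wait. */
    public enum Outcome { PRIMARY_APPEARED, TIMED_OUT_TAKE_OVER, INTERRUPTED }

    /**
     * Polls until a primary master is seen or maxWaitMillis elapses.
     * primaryIsActive stands in for "is there an active master znode?".
     */
    public static Outcome stallUntilPrimaryOrTimeout(BooleanSupplier primaryIsActive,
                                                     long maxWaitMillis,
                                                     long pollMillis) {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (!primaryIsActive.getAsBoolean()) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                // No primary showed up in time: stop stalling so the caller
                // can try to become the active master itself.
                return Outcome.TIMED_OUT_TAKE_OVER;
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.max(1L, Math.min(pollMillis, remainingMillis)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return Outcome.INTERRUPTED;
            }
        }
        return Outcome.PRIMARY_APPEARED;
    }

    public static void main(String[] args) {
        // A primary that never comes up: the wait gives up after ~200 ms.
        System.out.println(stallUntilPrimaryOrTimeout(() -> false, 200, 50));
    }
}
```

On TIMED_OUT_TAKE_OVER the caller would fall through to the normal 
active-master election instead of sleeping again. What the maximum wait should 
be is an open question that this sketch does not try to answer.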

[1] 
https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L502L508


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
