[ https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262236#comment-13262236 ]
Hudson commented on HBASE-5873:
-------------------------------

Integrated in HBase-0.92 #391 (See [https://builds.apache.org/job/HBase-0.92/391/])
    HBASE-5873 TimeOut Monitor thread should be started after atleast one region server registers. (Revision 1330558)

     Result = SUCCESS
larsh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java

> TimeOut Monitor thread should be started after atleast one region server registers.
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-5873
>                 URL: https://issues.apache.org/jira/browse/HBASE-5873
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.6
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>            Priority: Minor
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently the timeout monitor thread is started even before any region server
> has registered with the master.
> The timeout monitor depends on the region servers being online:
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
>         isEmpty();
> {code}
> When the master starts up, it sees that there are no online servers and hence
> sets allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again in the next cycle (after 10 secs), it sees
> allRSsOffline as false.
> Hence
> {code}
>  else if (this.allRegionServersOffline && !allRSsOffline) {
>             // if some RSs just came back online, we can start the
>             // the assignment right away
>             actOnTimeOut(regionState);
> {code}
> This condition makes the monitor act as if the region had timed out.
> Because of this, even if an assignment of ROOT is already in progress, this
> piece of code triggers another assignment, and we get a
> RegionAlreadyInTransitionException. We then have to wait 30 mins before ROOT
> itself is assigned.
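
The flag sequence described in the issue can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not the HBase source: the class `TimeoutMonitorSketch` and its `chore(int)` method are hypothetical names, the flag is assumed to start out false, and the stored flag is assumed to be updated at the end of each cycle, as the description implies.

```java
/**
 * Simplified, hypothetical model of the timeout-monitor flag logic
 * described in HBASE-5873. It is not the AssignmentManager code.
 */
public class TimeoutMonitorSketch {
    // Stands in for this.allRegionServersOffline; assumed to start as false.
    private boolean allRegionServersOffline = false;

    /**
     * Runs one monitor cycle.
     *
     * @param onlineServerCount number of region servers registered with the master
     * @return true if this cycle would take the "some RSs just came back online"
     *         branch and call actOnTimeOut for a region in transition
     */
    public boolean chore(int onlineServerCount) {
        boolean allRSsOffline = onlineServerCount == 0;
        // Same condition as the else-if quoted in the description.
        boolean actsOnTimeout = this.allRegionServersOffline && !allRSsOffline;
        // Equivalent of setAllRegionServersOffline(allRSsOffline).
        this.allRegionServersOffline = allRSsOffline;
        return actsOnTimeout;
    }

    public static void main(String[] args) {
        // Monitor started before any region server has registered.
        TimeoutMonitorSketch early = new TimeoutMonitorSketch();
        early.chore(0); // no servers yet, so the stored flag becomes true
        System.out.println("early start, next cycle acts on timeout: " + early.chore(1));

        // Monitor started only after one region server has registered.
        TimeoutMonitorSketch late = new TimeoutMonitorSketch();
        System.out.println("late start, first cycle acts on timeout: " + late.chore(1));
    }
}
```

In this model, a monitor whose first cycle runs with zero servers takes the forced-timeout branch on the next cycle, while one that first runs after a server has registered does not. That is the behaviour the issue title asks for.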