[
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263135#comment-13263135
]
Hudson commented on HBASE-5873:
-------------------------------
Integrated in HBase-0.94-security #22 (See
[https://builds.apache.org/job/HBase-0.94-security/22/])
HBASE-5873 TimeOut Monitor thread should be started after atleast one
region server registers. (Revision 1330549)
Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> TimeOut Monitor thread should be started after atleast one region server
> registers.
> -----------------------------------------------------------------------------------
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.6
> Reporter: ramkrishna.s.vasudevan
> Assignee: rajeshbabu
> Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently the timeout monitor thread is started before any region server
> has registered with the master.
> The timeout monitor depends on region servers being online:
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().isEmpty();
> {code}
> When the master starts up, it sees that there are no online servers and
> hence sets allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again in the next cycle (after 10 secs), it
> sees allRSsOffline as false.
> Hence
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
>   // if some RSs just came back online, we can start
>   // the assignment right away
>   actOnTimeOut(regionState);
> }
> {code}
> This condition makes the monitor act as if a timeout had occurred.
> Because of this, even if a region assignment of ROOT is already in progress,
> this piece of code triggers another assignment, and thus we get a
> RegionAlreadyInTransitionException. Later we need to wait 30 minutes for
> ROOT itself to be assigned.
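
The sequence described above can be reproduced with a small stand-alone sketch. This is not the AssignmentManager code: the class TimeoutMonitorSketch, its chore() and simulate() methods, and the counter are hypothetical names introduced only for illustration, and the sketch assumes the allRegionServersOffline flag starts out false. It shows the flag flipping from true to false between two monitor cycles, and how delaying the first cycle until at least one region server has registered would avoid the extra action.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the timeout monitor logic quoted above.
class TimeoutMonitorSketch {
    // Mirrors this.allRegionServersOffline; assumed to start as false.
    private boolean allRegionServersOffline = false;
    // Stands in for serverManager.getOnlineServersList().
    private final List<String> onlineServers = new ArrayList<>();
    // Counts how often the "RSs just came back online" branch fires.
    private int timeoutActions = 0;

    void registerServer(String name) {
        onlineServers.add(name);
    }

    // One cycle of the monitor, reduced to the flag handling.
    void chore() {
        boolean allRSsOffline = onlineServers.isEmpty();
        if (this.allRegionServersOffline && !allRSsOffline) {
            // Stands in for actOnTimeOut(regionState).
            timeoutActions++;
        }
        // Stands in for setAllRegionServersOffline(allRSsOffline).
        this.allRegionServersOffline = allRSsOffline;
    }

    // Returns how many timeout actions fire during startup.
    static int simulate(boolean waitForFirstServer) {
        TimeoutMonitorSketch monitor = new TimeoutMonitorSketch();
        if (!waitForFirstServer) {
            // Monitor starts before any region server has registered.
            monitor.chore();
        }
        monitor.registerServer("rs1");
        // Next cycle, after the region server is online.
        monitor.chore();
        return monitor.timeoutActions;
    }

    public static void main(String[] args) {
        System.out.println("started early: " + simulate(false)); // prints "started early: 1"
        System.out.println("started late:  " + simulate(true));  // prints "started late:  0"
    }
}
```

In the early-start case the first cycle records that all region servers are offline, so the second cycle takes the timeout branch even though nothing has actually timed out. In the late-start case the flag is never set to true, so the branch is not taken.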