[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262236#comment-13262236
 ] 

Hudson commented on HBASE-5873:
-------------------------------

Integrated in HBase-0.92 #391 (See 
[https://builds.apache.org/job/HBase-0.92/391/])
    HBASE-5873 TimeOut Monitor thread should be started after atleast one 
region server registers. (Revision 1330558)

     Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java

                
> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-5873
>                 URL: https://issues.apache.org/jira/browse/HBASE-5873
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.6
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>            Priority: Minor
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently the timeout monitor thread is started even before any region server 
> has registered with the master.
> The timeout monitor depends on the region servers being online:
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
>         isEmpty();
> {code}
> When the master starts up, it sees that there are no online servers and hence 
> sets allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time a region server has come up.
> When the timeout monitor runs again in the next cycle (after 10 seconds), it 
> sees allRSsOffline as false.
> Hence
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
>             // if some RSs just came back online, we can start
>             // the assignment right away
>             actOnTimeOut(regionState);
> {code}
> This condition makes the monitor act on the timeout.
> As a result, even while an assignment of the ROOT region is in progress, this 
> piece of code triggers another assignment, and we get a 
> RegionAlreadyInTransitionException. After that, we have to wait 30 minutes 
> before ROOT itself is assigned.
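
The sequence described above can be reproduced with a small toy model. This is only a sketch, not HBase code: the class TimeoutMonitorSketch and the methods chore and simulate are hypothetical names, and the actual change made in revision 1330558 is not shown in this message. The model assumes the fix amounts to not running the monitor until at least one region server has registered, as the issue title suggests.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the timeout monitor's per-cycle decision (hypothetical, not HBase code). */
class TimeoutMonitorSketch {
    // Stands in for this.allRegionServersOffline from the description.
    private boolean allRegionServersOffline = false;
    // Records every action the monitor takes, so the outcome can be inspected.
    private final List<String> actions = new ArrayList<>();

    /** One monitor cycle. onlineServers is the number of registered region servers. */
    void chore(int onlineServers, boolean rootInTransition) {
        boolean allRSsOffline = (onlineServers == 0);
        if (rootInTransition && this.allRegionServersOffline && !allRSsOffline) {
            // Same condition as quoted above: the monitor believes that
            // region servers "just came back online" and acts immediately.
            actions.add("actOnTimeOut(ROOT)");
        }
        // Equivalent of setAllRegionServersOffline(allRSsOffline).
        this.allRegionServersOffline = allRSsOffline;
    }

    /** Runs the startup sequence and returns the actions the monitor took. */
    static List<String> simulate(boolean startBeforeRegistration) {
        TimeoutMonitorSketch monitor = new TimeoutMonitorSketch();
        if (startBeforeRegistration) {
            // Master has just started: no region server has registered yet.
            monitor.chore(0, false);
        }
        // A region server registers and ROOT assignment begins; next cycle runs.
        monitor.chore(1, true);
        return monitor.actions;
    }

    public static void main(String[] args) {
        System.out.println("monitor started early:    " + simulate(true));
        System.out.println("monitor started after RS: " + simulate(false));
    }
}
```

In this model, simulate(true) records one spurious actOnTimeOut on ROOT while its first assignment is still in progress, and simulate(false) records none, because the stale "all offline" flag is never set.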

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
