[ https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268525#comment-13268525 ]
Zhihong Yu commented on HBASE-5875:
-----------------------------------

The following change is for debugging, right? If so, please change the log level accordingly:
{code}
+    }catch(NotServingRegionException nsre){
+      LOG.info("Failed verification of " + Bytes.toStringBinary(regionName) +
+        " at address=" + address + "; " + t);
+      throw nsre;
{code}

{code}
+    } catch (NotServingRegionException nsre) {
+      if(rit == true){
+        // the root region location is available.
{code}
People unfamiliar with processRegionInTransitionAndBlockUntilAssigned() may be confused by the code above. rit actually means that the root region has come out of transition, so rit should be renamed accordingly.

{code}
+  public void setServerShutdownHandlerEnabled(boolean setServerShutDownEnabled) {
{code}
The above method should be made package-private. Appending 'ForTest' to the method name would help clarify its purpose.

> Process RIT and Master restart may remove an online server considering it as
> a dead server
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5875
>                 URL: https://issues.apache.org/jira/browse/HBASE-5875
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.1
>
>         Attachments: HBASE-5875.patch, HBASE-5875_0.94.patch
>
>
> If on master restart it finds the ROOT/META to be in RIT state, master tries
> to assign the ROOT region through ProcessRIT.
> Master will trigger the assignment and next will try to verify the Root
> Region Location.
> Root region location verification is done seeing if the RS has the region in
> its online list.
> If the master triggered assignment has not yet been completed in RS then the
> verify root region location will fail.
> Because it failed
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> we do split log and also remove the server from online server list.
> Ideally here there is nothing to do in splitlog as no region server was
> restarted.
> So master, though the server is online, master just invalidates the region
> server.
> In a special case, if i have only one RS then my cluster will become non
> operative.
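
For reference, the three review suggestions above (debug-level logging, a descriptive name in place of rit, and a package-private setter with a 'ForTest' suffix) could look roughly like the sketch below. This is only an illustration and not the attached patch. ReviewSuggestionsSketch, FakeRegionServer, isRootLocationAvailable and rootRegionOutOfTransition are hypothetical names, the nested NotServingRegionException is a stand-in for the HBase class, java.util.logging at Level.FINE stands in for a debug-level call on the HBase logger, and returning the flag from the catch block is an assumption based solely on the quoted comment "the root region location is available".

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: every type here is a stand-in, not a real HBase class.
class ReviewSuggestionsSketch {
    private static final Logger LOG =
        Logger.getLogger(ReviewSuggestionsSketch.class.getName());

    /** Stand-in for HBase's NotServingRegionException. */
    static class NotServingRegionException extends Exception {
        NotServingRegionException(String message) {
            super(message);
        }
    }

    /** Stand-in region server that only tracks which regions it has online. */
    static class FakeRegionServer {
        private final Set<String> onlineRegions = new HashSet<>();

        void open(String regionName) {
            onlineRegions.add(regionName);
        }

        void checkServing(String regionName) throws NotServingRegionException {
            if (!onlineRegions.contains(regionName)) {
                throw new NotServingRegionException(regionName + " is not online");
            }
        }
    }

    private boolean serverShutdownHandlerEnabled;

    // Suggestion 3: package-private (no access modifier) and suffixed with ForTest.
    void setServerShutdownHandlerEnabledForTest(boolean enabled) {
        this.serverShutdownHandlerEnabled = enabled;
    }

    boolean isServerShutdownHandlerEnabled() {
        return serverShutdownHandlerEnabled;
    }

    // Suggestion 1: log the failed verification at debug level (FINE here), then rethrow.
    static void verifyRegionLocation(FakeRegionServer server, String regionName,
                                     String address) throws NotServingRegionException {
        try {
            server.checkServing(regionName);
        } catch (NotServingRegionException nsre) {
            LOG.log(Level.FINE, "Failed verification of " + regionName
                + " at address=" + address + "; " + nsre);
            throw nsre;
        }
    }

    // Suggestion 2: the flag is named for what it means, and is not compared to true.
    static boolean isRootLocationAvailable(FakeRegionServer server, String address,
                                           boolean rootRegionOutOfTransition) {
        try {
            verifyRegionLocation(server, "-ROOT-", address);
            return true;
        } catch (NotServingRegionException nsre) {
            // Assumption: once the root region is out of transition, its location
            // is treated as available even though this check failed.
            return rootRegionOutOfTransition;
        }
    }

    public static void main(String[] args) {
        FakeRegionServer server = new FakeRegionServer();
        // The region is not open yet and is still in transition.
        System.out.println(isRootLocationAvailable(server, "host:60020", false));
        server.open("-ROOT-");
        // The region is now online on the server.
        System.out.println(isRootLocationAvailable(server, "host:60020", false));
    }
}
```

With the setter package-private, only code in the same package (typically the unit tests) can toggle the flag, and the suffix makes it clear at call sites that it is not part of the production API.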