[ https://issues.apache.org/jira/browse/HBASE-25032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caroline reassigned HBASE-25032:
--------------------------------

    Assignee: Caroline  (was: Sandeep Guggilam)

> Wait for region server to become online before adding it to online servers in Master
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-25032
>                 URL: https://issues.apache.org/jira/browse/HBASE-25032
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sandeep Guggilam
>            Assignee: Caroline
>            Priority: Major
>
> As part of RS start up, the RS reports for duty to the Master. The Master
> acknowledges the request and adds the RS to the onlineServers list so that
> regions can be assigned to it.
> Once the Master acknowledges the reportForDuty and sends back the response,
> the RS still does a bunch of work, such as initializing replication sources,
> before becoming online. However, initializing replication sources can
> sometimes stall when the RS is unable to connect to peer clusters because of
> some Kerberos configuration, and there can be a delay of around 20 minutes
> before the RS becomes online.
>
> Since the Master already considers the RS online, it tries to assign regions
> to it, which fails with a ServerNotRunningYet exception. The Master then
> tries to unassign, which fails with the same exception and leaves the region
> in the FAILED_CLOSE state.
>
> It would be good to have a check to see if the RS is ready to accept
> assignment requests before adding it to the online servers list, which would
> account for any such delays as described above.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
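
The check proposed in the description could be sketched roughly as follows. This is only an illustration of the idea, not HBase code and not necessarily how the issue was resolved: the class `ReadinessGate` and its methods `reportForDuty`, `markOnline` and `isAssignable` are hypothetical names, and server names are plain strings. The sketch keeps a server in a pending set after it reports for duty and only treats it as an assignment target once it confirms that start-up has finished.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names are HBase APIs.
class ReadinessGate {
    // Servers that have reported for duty but are still initializing.
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    // Servers that have confirmed they can accept assignment requests.
    private final Set<String> online = ConcurrentHashMap.newKeySet();

    // Acknowledge a report-for-duty without making the server assignable yet.
    public void reportForDuty(String serverName) {
        if (!online.contains(serverName)) {
            pending.add(serverName);
        }
    }

    // Called once the server has finished start-up, for example after its
    // replication sources are initialized. Returns false if the server never
    // reported for duty.
    public boolean markOnline(String serverName) {
        if (pending.remove(serverName)) {
            online.add(serverName);
            return true;
        }
        return false;
    }

    // Only servers that completed start-up may receive regions.
    public boolean isAssignable(String serverName) {
        return online.contains(serverName);
    }

    public static void main(String[] args) {
        ReadinessGate gate = new ReadinessGate();
        gate.reportForDuty("rs1");
        System.out.println(gate.isAssignable("rs1")); // false: still initializing
        gate.markOnline("rs1");
        System.out.println(gate.isAssignable("rs1")); // true: start-up finished
    }
}
```

With a gate like this, an assignment attempt made during the 20-minute initialization window would be skipped or deferred instead of failing with ServerNotRunningYet and pushing the region towards FAILED_CLOSE.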