[ https://issues.apache.org/jira/browse/HBASE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guanghao Zhang reopened HBASE-20698:
------------------------------------

Reopening this because I found another problem...

When a region server expires, it is removed from onlineServers. getServerVersion may now return 0 when the server is not in onlineServers. RSProcedureDispatcher is a ServerListener, and there is a race between ServerManager and RSProcedureDispatcher. For a RefreshPeerProcedure whose target server has expired, addOperationToNode may succeed, but remoteDispatch may then get version 0. In that case the RefreshPeerProcedure will fail to dispatch...

> Master don't record right server version until new started region server call
> regionServerReport method
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20698
>                 URL: https://issues.apache.org/jira/browse/HBASE-20698
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2
>    Affects Versions: 2.0.0
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 2.0.1
>
>         Attachments: HBASE-20698.master.001.patch,
> HBASE-20698.master.002.patch, HBASE-20698.master.003.patch
>
>
> When a new region server starts, it calls regionServerStartup first. The
> master records this server as a new online server and may dispatch a
> RemoteProcedure to it. But the master only records the server version when
> the new region server calls the regionServerReport method. Dispatching a new
> RemoteProcedure to this new region server will fail if the version is not right.
> {code:java}
> @Override
> protected void remoteDispatch(final ServerName serverName,
>     final Set<RemoteProcedure> remoteProcedures) {
>   final int rsVersion = master.getAssignmentManager().getServerVersion(serverName);
>   if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) {
>     LOG.trace("Using procedure batch rpc execution for serverName={} version={}",
>       serverName, rsVersion);
>     submitTask(new ExecuteProceduresRemoteCall(serverName, remoteProcedures));
>   } else {
>     LOG.info(String.format(
>       "Fallback to compat rpc execution for serverName=%s version=%s",
>       serverName, rsVersion));
>     submitTask(new CompatRemoteProcedureResolver(serverName, remoteProcedures));
>   }
> }
> {code}
> The above code uses the version to handle compatibility, so dispatch works
> correctly for region servers running an old version. RefreshPeerProcedure,
> however, is new in HBase 2.0 and does not need this fallback. But because the
> recorded version of the new region server is not right, the master uses
> CompatRemoteProcedureResolver for RefreshPeerProcedure too, so the
> RefreshPeerProcedure cannot be executed correctly.
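
Both failure modes described in this issue (a newly started server with no recorded version, and an expired server that is no longer in onlineServers) come down to the version lookup yielding 0 and the dispatcher taking the compat branch. A minimal sketch of that behaviour follows. It is not HBase code: the class `ServerVersionSketch`, the helpers `startup`, `report`, `expire` and `choosePath`, and the threshold value `2` are all hypothetical stand-ins chosen for illustration; only the "unknown server yields version 0" rule and the `>=` comparison are taken from the description above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ServerVersionSketch {
    // Hypothetical threshold; stands in for RS_VERSION_WITH_EXEC_PROCS.
    static final int MIN_VERSION_FOR_BATCH_RPC = 2;

    // Toy stand-in for the master's online-server table: name -> recorded version.
    // A recorded version of 0 means "no report received yet".
    static final Map<String, Integer> onlineServers = new ConcurrentHashMap<>();

    // Models regionServerStartup: the server is online, but no version is known.
    static void startup(String server) { onlineServers.putIfAbsent(server, 0); }

    // Models regionServerReport: the version is recorded only at this point.
    static void report(String server, int version) { onlineServers.put(server, version); }

    // Models server expiry: the server is dropped from the online table.
    static void expire(String server) { onlineServers.remove(server); }

    // Returns 0 for a server that is not (or no longer) in the online table.
    static int getServerVersion(String server) {
        return onlineServers.getOrDefault(server, 0);
    }

    // Mirrors the branch in remoteDispatch: "batch" or the "compat" fallback.
    static String choosePath(String server) {
        return getServerVersion(server) >= MIN_VERSION_FOR_BATCH_RPC ? "batch" : "compat";
    }

    public static void main(String[] args) {
        startup("rs1");
        System.out.println(choosePath("rs1")); // compat: started, not yet reported
        report("rs1", 2);
        System.out.println(choosePath("rs1")); // batch: version is now recorded
        expire("rs1");
        System.out.println(choosePath("rs1")); // compat: expired, version reads as 0
    }
}
```

In this toy model, any procedure that exists only on the batch path (as RefreshPeerProcedure does, per the description) would be misrouted in the first and third cases, even though the target server actually runs a new enough version.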