[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555021#comment-16555021 ]
Allan Yang commented on HBASE-20867:
------------------------------------

[~stack], thanks for reviewing! Will upload a patch later.

> RS may get killed while master restarts
> ---------------------------------------
>
>                 Key: HBASE-20867
>                 URL: https://issues.apache.org/jira/browse/HBASE-20867
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
>         Attachments: HBASE-20867.branch-2.0.001.patch,
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch,
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch
>
>
> If the master is dispatching an RPC call to an RS while the master is
> aborting, the RPC layer may throw a connection exception (an IOException
> with a "Connection closed" message in this case). The RSProcedureDispatcher
> will regard it as an un-retryable exception and pass it to
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> In fact, the RS is perfectly healthy; only the master is restarting.
> I think we should handle these kinds of connection exceptions in
> RSProcedureDispatcher and retry the RPC call.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
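
The retry idea proposed in the issue description can be sketched roughly as follows. This is a minimal, self-contained illustration, not the attached patch and not HBase's actual RSProcedureDispatcher code. The class `DispatchRetrySketch`, its two methods, and the rule used to recognise a connection exception are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; none of these names come from HBase.
public class DispatchRetrySketch {

  // Assumption: a failure counts as connection-level if it is a
  // ConnectException or an IOException whose message contains
  // "Connection closed" (the message quoted in the issue description).
  static boolean isConnectionException(IOException e) {
    if (e instanceof ConnectException) {
      return true;
    }
    String msg = e.getMessage();
    return msg != null && msg.contains("Connection closed");
  }

  // Runs the call, retrying only on connection-level failures, for at most
  // maxAttempts attempts. Any other IOException is rethrown at once, so the
  // caller can still treat it as a real remote failure.
  static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (IOException e) {
        if (!isConnectionException(e)) {
          throw e; // not a connection problem: do not retry
        }
        last = e; // connection problem: try again
      }
    }
    throw last; // every attempt failed with a connection exception
  }
}
```

A real dispatcher would presumably also back off between attempts and stop retrying once the region server is known to be dead; the sketch leaves both out to keep the classification step visible.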