[ https://issues.apache.org/jira/browse/HBASE-22287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120259#comment-17120259 ]
Hudson commented on HBASE-22287:
--------------------------------

Results for branch master
	[build #1741 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1741/]: (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/1741/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1698/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1741/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/master/1741/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}

> inifinite retries on failed server in RSProcedureDispatcher
> -----------------------------------------------------------
>
>                 Key: HBASE-22287
>                 URL: https://issues.apache.org/jira/browse/HBASE-22287
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> We observed this recently on some cluster, I'm still investigating the root
> cause however seems like the retries should have special handling for this
> exception; and separately probably a cap on number of retries
> {noformat}
> 2019-04-20 04:24:27,093 WARN  [RSProcedureDispatcher-pool4-t1285]
> procedure.RSProcedureDispatcher: request to server ,17020,1555742560432
> failed due to java.io.IOException: Call to :17020 failed on local exception:
> org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the
> failed servers list: :17020, try=26603, retrying...
> {noformat}
> The corresponding worker is stuck

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
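
The issue description asks for two things: special handling when the failure is a FailedServerException, and a cap on the number of retries. The following is a minimal sketch of that idea only; it is not the fix that was committed for HBASE-22287. Everything in it is hypothetical: the CappedRetry class, the RemoteCall interface, the Outcome enum, the limits passed to dispatch(), and the nested FailedServerException, which is a local stand-in for org.apache.hadoop.hbase.ipc.FailedServerException so the example compiles on its own. It also assumes the FailedServerException arrives as a cause of the outer IOException, which the log line suggests but does not prove.

```java
import java.io.IOException;

public class CappedRetry {
    /** Hypothetical stand-in for org.apache.hadoop.hbase.ipc.FailedServerException. */
    public static class FailedServerException extends IOException {
        public FailedServerException(String msg) { super(msg); }
    }

    /** Hypothetical result of a dispatch attempt sequence. */
    public enum Outcome { SUCCEEDED, GAVE_UP_FAILED_SERVER, GAVE_UP_MAX_ATTEMPTS, INTERRUPTED }

    /** Hypothetical remote call that may fail with an IOException. */
    public interface RemoteCall { void run() throws IOException; }

    /** Walks the cause chain, since the log shows the exception wrapped in an IOException. */
    public static boolean isFailedServer(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof FailedServerException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Retries the call at most maxAttempts times in total, and gives up earlier
     * once maxFailedServerAttempts failures were caused by a failed server.
     */
    public static Outcome dispatch(RemoteCall call, int maxAttempts,
                                   int maxFailedServerAttempts, long baseSleepMs) {
        int failedServerHits = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                call.run();
                return Outcome.SUCCEEDED;
            } catch (IOException e) {
                // Special handling: the server is already known to be bad, so stop early.
                if (isFailedServer(e) && ++failedServerHits >= maxFailedServerAttempts) {
                    return Outcome.GAVE_UP_FAILED_SERVER;
                }
            }
            if (attempt < maxAttempts) {
                // Exponential backoff, capped at one minute between attempts.
                long sleepMs = Math.min(baseSleepMs << Math.min(attempt - 1, 10), 60_000L);
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return Outcome.INTERRUPTED;
                }
            }
        }
        return Outcome.GAVE_UP_MAX_ATTEMPTS;
    }
}
```

With limits like these, a caller would get GAVE_UP_FAILED_SERVER after a handful of attempts instead of logging try=26603 while the worker stays stuck. What the caller does next (for example, failing the remote procedure so it can be rescheduled) is left open here.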