[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741003#comment-16741003 ]
Hudson commented on HBASE-21325:
--------------------------------

Results for branch branch-1
[build #629 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/629/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/629//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/629//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/629//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
>                 Key: HBASE-21325
>                 URL: https://issues.apache.org/jira/browse/HBASE-21325
>             Project: HBase
>          Issue Type: Improvement
>   Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>            Reporter: Duo Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 2.2.0
>
>         Attachments: HBASE-21325.master.001.patch,
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch,
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch,
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transition the remote
> cluster to DA while the local cluster is still in A, the region server hangs
> on shutdown. Because the fsOk flag only tests the local cluster (which is
> reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken
> (the remote WAL directory is gone), closing the regions never succeeds. This
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think here we should have an upper bound for the wait time in
> waitOnAllRegionsToClose method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
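
The upper bound proposed in the description can be illustrated with a deadline-based polling loop. This is only a minimal sketch of the general idea, not the code from the attached patches: the class `BoundedWait`, the method `waitUntil`, and the parameters `timeoutMs` and `pollMs` are hypothetical names chosen for illustration, and the `BooleanSupplier` stands in for whatever check decides that all regions have closed.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class BoundedWait {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout or interrupt.
     * (Hypothetical helper; not an HBase API.)
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        // Use the monotonic clock so wall-clock adjustments cannot extend the wait.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // upper bound reached; the caller decides what to do next
            }
            long remainingMs = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(pollMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A condition that never becomes true, standing in for a region
        // that cannot close because its WAL is broken.
        boolean closed = waitUntil(() -> false, 200, 50);
        if (!closed) {
            System.out.println("Timed out waiting for regions to close");
        }
    }
}
```

With a bound like this, the caller gets a `false` result instead of blocking forever and can then choose a fallback, such as forcing the process to terminate as the issue title suggests.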