[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715414#comment-16715414 ]

stack commented on HBASE-21559:
---

This cleaned up the failures nicely. Thanks [~openinx].

> The RestoreSnapshotFromClientTestBase related UT are flaky
> --
>
> Key: HBASE-21559
> URL: https://issues.apache.org/jira/browse/HBASE-21559
> Project: HBase
> Issue Type: Bug
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21559.v1.patch, HBASE-21559.v2.patch,
> TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml,
> org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt,
> org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt
>
> The related UT are:
> * TestRestoreSnapshotFromClientAfterSplittingRegions
> * TestRestoreSnapshotFromClientWithRegionReplicas
> * TestMobRestoreSnapshotFromClientAfterSplittingRegions
> I guess the main problem is: a dead lock between SplitTableRegionProcedure and SnapshotProcedure..
> Attached logs from the failed UT.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712691#comment-16712691 ]

Hudson commented on HBASE-21559:

Results for branch master [build #650 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/650/]: (x) *-1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/650//General_Nightly_Build_Report/]
(x) -1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/650//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) -1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/650//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
(/) +1 client integration test
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712675#comment-16712675 ]

Hudson commented on HBASE-21559:

Results for branch branch-2 [build #1544 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1544/]: (x) *-1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1544//General_Nightly_Build_Report/]
(x) -1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1544//JDK8_Nightly_Build_Report_(Hadoop2)/]
(/) +1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1544//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
(/) +1 client integration test
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712338#comment-16712338 ]

stack commented on HBASE-21559:
---

Nice fix [~openinx]. Running it locally, I can't make it hang anymore. Thanks.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712323#comment-16712323 ]

Hudson commented on HBASE-21559:

Results for branch branch-2.1 [build #664 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/664/]: (/) *+1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/664//General_Nightly_Build_Report/]
(/) +1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/664//JDK8_Nightly_Build_Report_(Hadoop2)/]
(/) +1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/664//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
(/) +1 client integration test
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712311#comment-16712311 ]

Hudson commented on HBASE-21559:

Results for branch branch-2.0 [build #1143 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1143/]: (x) *-1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1143//General_Nightly_Build_Report/]
(x) -1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1143//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) -1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1143//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712174#comment-16712174 ]

Duo Zhang commented on HBASE-21559:
---

Pushed to branch-2.0+. Let's see how it works.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711921#comment-16711921 ]

Hadoop QA commented on HBASE-21559:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 10s | master passed |
| +1 | compile | 2m 25s | master passed |
| +1 | checkstyle | 1m 19s | master passed |
| +1 | shadedjars | 4m 48s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 35s | master passed |
| +1 | javadoc | 0m 38s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 1s | the patch passed |
| +1 | compile | 2m 21s | the patch passed |
| +1 | javac | 2m 21s | the patch passed |
| +1 | checkstyle | 1m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 44s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 34s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 40s | the patch passed |
| +1 | javadoc | 0m 39s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 225m 27s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 271m 1s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21559 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950848/HBASE-21559.v2.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux daa900739133 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 12e75a8a63 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15211/testReport/ |
| Max. process+thread count | 5021 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711604#comment-16711604 ]

Hadoop QA commented on HBASE-21559:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 33s | master passed |
| +1 | compile | 1m 58s | master passed |
| +1 | checkstyle | 1m 15s | master passed |
| +1 | shadedjars | 4m 13s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 2s | master passed |
| +1 | javadoc | 0m 31s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 27s | the patch passed |
| +1 | compile | 1m 55s | the patch passed |
| +1 | javac | 1m 55s | the patch passed |
| +1 | checkstyle | 1m 15s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 7s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 27s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 131m 44s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 171m 4s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21559 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950827/HBASE-21559.v1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 60403a0126a1 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 12e75a8a63 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15208/testReport/ |
| Max. process+thread count | 4432 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711538#comment-16711538 ]

Zheng Hu commented on HBASE-21559:
--

bq. But at t8, the TakeSnapshotHandler is already in the map right?

Thinking about the above case again, there should be no problem if we move the v != null && v.isFinished() check out of computeIfPresent, because the status of v only transitions from not finished to finished. If we read a not-finished state, the STRP (SplitTableRegionProcedure) won't proceed, so it is OK even if someone changes the state from not finished to finished afterwards.

bq. so the problem here is that the state should be volatile.

The state is volatile now.
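
The argument in this comment can be sketched in Java. This is a minimal illustration under stated assumptions, not the HBASE-21559 patch: `SnapshotCheckSketch`, `Handler`, `register`, `markFinished`, and the `String` table key are hypothetical names, and only `isFinished`, `isTakingSnapshot`, and `computeIfPresent` come from the discussion. The sketch assumes the handlers live in a `ConcurrentHashMap` and that the finished flag is volatile and only ever moves from false to true.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a snapshot manager; not HBase's SnapshotManager.
class SnapshotCheckSketch {
  // Hypothetical stand-in for TakeSnapshotHandler.
  static final class Handler {
    // One-way flag: it goes from false to true and never back.
    private volatile boolean finished;

    void markFinished() { finished = true; }

    boolean isFinished() { return finished; }
  }

  private final ConcurrentHashMap<String, Handler> handlers = new ConcurrentHashMap<>();

  Handler register(String table) {
    Handler h = new Handler();
    handlers.put(table, h);
    return h;
  }

  // Plain get(), then check the flag outside any map-level lock
  // (that is, not inside computeIfPresent).
  boolean isTakingSnapshot(String table) {
    Handler h = handlers.get(table);
    return h != null && !h.isFinished();
  }

  public static void main(String[] args) {
    SnapshotCheckSketch mgr = new SnapshotCheckSketch();
    System.out.println(mgr.isTakingSnapshot("t1")); // false: no handler registered
    Handler h = mgr.register("t1");
    System.out.println(mgr.isTakingSnapshot("t1")); // true: handler still running
    h.markFinished();
    System.out.println(mgr.isTakingSnapshot("t1")); // false: handler finished
  }
}
```

Because the flag is one-way, a stale "not finished" read can only make the caller back off when it no longer has to, which is the safe direction.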
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711506#comment-16711506 ]

Duo Zhang commented on HBASE-21559:
---

But at t8, the TakeSnapshotHandler is already in the map right? We will not hold the map lock when changing its state, so the problem here is that the state should be volatile.
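
Why a flag written without the map lock should be `volatile` can be illustrated with a small Java sketch. `VolatileFlagSketch` and its members are hypothetical names, not HBase code. One thread flips the flag while another polls it; under the Java Memory Model, the `volatile` modifier is what guarantees the polling thread eventually observes the write when no lock is shared between the two.

```java
// Hypothetical illustration; not taken from HBase.
class VolatileFlagSketch {
  // Without 'volatile', the polling loop below would not be guaranteed
  // to ever see the update made by the writer thread.
  private volatile boolean finished = false;

  void markFinished() { finished = true; }

  boolean isFinished() { return finished; }

  // Starts a writer thread, then polls until its write becomes visible.
  static boolean pollUntilFinished() {
    VolatileFlagSketch state = new VolatileFlagSketch();
    Thread writer = new Thread(state::markFinished);
    writer.start();
    while (!state.isFinished()) {
      Thread.yield(); // busy-wait; acceptable in a demo, not in production code
    }
    try {
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return state.isFinished();
  }

  public static void main(String[] args) {
    System.out.println(pollUntilFinished()); // true
  }
}
```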
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711503#comment-16711503 ]

Zheng Hu commented on HBASE-21559:
--

bq. But they will not hold the map lock when modifying right?

Assume the case:
t1. start snapshot
t2. hold the table x-lock
t3. release the table x-lock
t4. downgrade to s-lock because the table is enabled
t5. start snapshot on RS...
t6. SplitTableRegionProcedure (STRP) starts
t7. STRP holds the table s-lock
t8. check isTakingSnapshot

Then at t8, the SnapshotManager may update the status of the handler at any time, I think.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711436#comment-16711436 ]

Zheng Hu commented on HBASE-21559:
--

bq. Do we really need to test the isFinished under the map lock? Just get it out and check?

I thought it better to do this, in case we get a handler and someone updates the handler's status at that same moment.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711430#comment-16711430 ]

Duo Zhang commented on HBASE-21559:
---

Do we really need to test the isFinished under the map lock? Just get it out and check?
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711439#comment-16711439 ]

Duo Zhang commented on HBASE-21559:
---

But they will not hold the map lock when modifying right?
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711429#comment-16711429 ]

Zheng Hu commented on HBASE-21559:
--

BTW, I think we can move the snapshot feature from procedure.v1 to procedure.v2 in the future, so I assigned HBASE-14413 to myself.
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711425#comment-16711425 ] Zheng Hu commented on HBASE-21559: -- I ran the UT locally 5 times; it seems OK now. > The RestoreSnapshotFromClientTestBase related UT are flaky > -- > > Key: HBASE-21559 > URL: https://issues.apache.org/jira/browse/HBASE-21559 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.1.2, 2.0.4, 2.0.5 > > Attachments: HBASE-21559.v1.patch, > TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt > > > The related UT are: > * TestRestoreSnapshotFromClientAfterSplittingRegions > * TestRestoreSnapshotFromClientWithRegionReplicas > * TestMobRestoreSnapshotFromClientAfterSplittingRegions > I guess the main problem is: a dead lock between SplitTableRegionProcedure > and SnapshotProcedure.. > Attached logs from the failed UT. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711397#comment-16711397 ] Zheng Hu commented on HBASE-21559: -- Currently, SnapshotManager grabs its own object lock in many methods. This is a very coarse way of locking. I think we should change how SnapshotManager locks: rather than synchronizing on the whole SnapshotManager object, use finer-grained locks (to avoid deadlocks). Anyway, let me fix this deadlock first, so I have uploaded patch.v1. > The RestoreSnapshotFromClientTestBase related UT are flaky > -- > > Key: HBASE-21559 > URL: https://issues.apache.org/jira/browse/HBASE-21559 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.1.2, 2.0.4, 2.0.5 > > Attachments: HBASE-21559.v1.patch, > TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt > > > The related UT are: > * TestRestoreSnapshotFromClientAfterSplittingRegions > * TestRestoreSnapshotFromClientWithRegionReplicas > * TestMobRestoreSnapshotFromClientAfterSplittingRegions > I guess the main problem is: a dead lock between SplitTableRegionProcedure > and SnapshotProcedure.. > Attached logs from the failed UT. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
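The locking concern in the comment above (one big synchronized SnapshotManager object versus a finer-grained lock) can be illustrated with a small standalone sketch. It is not the attached patch and does not use any HBase classes: `SnapshotTracker`, `tableLock`, and `demo()` are hypothetical names invented for illustration. The assumption is that the "is a snapshot running?" check only needs a thread-safe read, so it can consult a concurrent set instead of taking the same monitor that the snapshot path holds while it waits for the table lock.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class FineGrainedSnapshotLockSketch {

  /** Hypothetical stand-in for a snapshot manager; NOT the HBase class. */
  static class SnapshotTracker {
    // Thread-safe set, so readers do not need the object monitor.
    private final Set<String> inProgress = ConcurrentHashMap.newKeySet();

    /** Deliberately NOT synchronized: callers holding a table lock never block here. */
    boolean isTakingSnapshot(String table) {
      return inProgress.contains(table);
    }

    /** Still serialized on the monitor, and waits for the table lock while holding it. */
    synchronized boolean takeSnapshot(String table, ReentrantLock tableLock, long timeoutMs)
        throws InterruptedException {
      inProgress.add(table);
      try {
        if (!tableLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
          return false; // could not get the table lock in time
        }
        tableLock.unlock(); // real snapshot work would run before this unlock
        return true;
      } finally {
        inProgress.remove(table);
      }
    }
  }

  /** Returns true if both threads finish, i.e. the lock-order inversion did not deadlock. */
  static boolean demo() {
    SnapshotTracker tracker = new SnapshotTracker();
    ReentrantLock tableLock = new ReentrantLock(); // hypothetical per-table lock
    CountDownLatch splitHoldsTableLock = new CountDownLatch(1);
    AtomicBoolean splitSawSnapshot = new AtomicBoolean(false);
    AtomicBoolean snapshotDone = new AtomicBoolean(false);

    // Plays the role of the split procedure: table lock first, then the snapshot check.
    Thread split = new Thread(() -> {
      tableLock.lock();
      try {
        splitHoldsTableLock.countDown();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (System.nanoTime() < deadline) {
          if (tracker.isTakingSnapshot("t1")) {
            splitSawSnapshot.set(true); // a real procedure would abort the split here
            break;
          }
          Thread.yield();
        }
      } finally {
        tableLock.unlock();
      }
    });

    // Plays the role of the snapshot RPC handler: monitor first, then the table lock.
    Thread snapshot = new Thread(() -> {
      try {
        splitHoldsTableLock.await();
        snapshotDone.set(tracker.takeSnapshot("t1", tableLock, 5000));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    split.start();
    snapshot.start();
    try {
      split.join(10000);
      snapshot.join(10000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return splitSawSnapshot.get() && snapshotDone.get()
        && !split.isAlive() && !snapshot.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("completed without deadlock: " + demo());
  }
}
```

If `isTakingSnapshot` were also declared `synchronized`, the split thread would block on the monitor while still holding the table lock, and the snapshot thread could only give up through its timeout. Keeping the read path off the monitor is one way to break that cycle; a dedicated read/write lock would be another.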
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711212#comment-16711212 ] Duo Zhang commented on HBASE-21559: --- Oh, we have some logic in SnapshotProcedure, SplitTableRegionProcedure and MergeTableRegionProcedure to prevent splitting/merging from happening while we are taking a snapshot. Maybe there is something wrong in that logic? See HBASE-21480. > The RestoreSnapshotFromClientTestBase related UT are flaky > -- > > Key: HBASE-21559 > URL: https://issues.apache.org/jira/browse/HBASE-21559 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.1.2, 2.0.4, 2.0.5 > > Attachments: > TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt > > > The related UT are: > * TestRestoreSnapshotFromClientAfterSplittingRegions > * TestRestoreSnapshotFromClientWithRegionReplicas > * TestMobRestoreSnapshotFromClientAfterSplittingRegions > I guess the main problem is: a dead lock between SplitTableRegionProcedure > and SnapshotProcedure.. > Attached logs from the failed UT. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711214#comment-16711214 ] Zheng Hu commented on HBASE-21559: --
Yeah, it's a deadlock. The SplitTableRegionProcedure grabbed the table lock and is waiting to grab the SnapshotManager object lock, while the SnapshotManager grabbed the SnapshotManager object lock and is waiting for the table lock?
The SplitTableRegionProcedure stack:
{code}
Thread 527 (PEWorker-1):
  State: BLOCKED
  Blocked count: 10
  Waited count: 89
  Blocked on org.apache.hadoop.hbase.master.snapshot.SnapshotManager@51c5c8d5
  Blocked by 412 (RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=53736)
  Stack:
    org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isTakingSnapshot(SnapshotManager.java:423)
    org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.prepareSplitRegion(SplitTableRegionProcedure.java:470)
    org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:244)
    org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:97)
    org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
    org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:965)
    org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1723)
    org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1462)
    org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78)
    org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2039)
{code}
And the SnapshotManager trace:
{code}
Thread 412 (RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=53736):
  State: TIMED_WAITING
  Blocked count: 60
  Waited count: 359
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    org.apache.hadoop.hbase.master.locking.LockManager$MasterLock.tryAcquire(LockManager.java:162)
    org.apache.hadoop.hbase.master.locking.LockManager$MasterLock.acquire(LockManager.java:123)
    org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.prepare(TakeSnapshotHandler.java:141)
    org.apache.hadoop.hbase.master.snapshot.EnabledTableSnapshotHandler.prepare(EnabledTableSnapshotHandler.java:60)
    org.apache.hadoop.hbase.master.snapshot.EnabledTableSnapshotHandler.prepare(EnabledTableSnapshotHandler.java:46)
    org.apache.hadoop.hbase.master.snapshot.SnapshotManager.snapshotTable(SnapshotManager.java:524)
    org.apache.hadoop.hbase.master.snapshot.SnapshotManager.snapshotEnabledTable(SnapshotManager.java:510)
    org.apache.hadoop.hbase.master.snapshot.SnapshotManager.takeSnapshotInternal(SnapshotManager.java:633)
    org.apache.hadoop.hbase.master.snapshot.SnapshotManager.takeSnapshot(SnapshotManager.java:570)
    org.apache.hadoop.hbase.master.MasterRpcServices.snapshot(MasterRpcServices.java:1502)
    org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}
> The RestoreSnapshotFromClientTestBase related UT are flaky > -- > > Key: HBASE-21559 > URL: https://issues.apache.org/jira/browse/HBASE-21559 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > 
Fix For: 3.0.0, 2.1.2, 2.0.4, 2.0.5 > > Attachments: > TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt > > > The related UT are: > * TestRestoreSnapshotFromClientAfterSplittingRegions > * TestRestoreSnapshotFromClientWithRegionReplicas > * TestMobRestoreSnapshotFromClientAfterSplittingRegions > I guess the main problem is: a dead lock between SplitTableRegionProcedure > and SnapshotProcedure.. > Attached logs from the failed UT. -- This messa