[ https://issues.apache.org/jira/browse/HBASE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137606#comment-15137606 ]
Ted Yu commented on HBASE-15192:
--------------------------------

If there is no objection, I plan to resolve this JIRA.

There is still a flaky subtest, e.g.:
https://builds.apache.org/job/HBase-TRUNK_matrix/lastCompletedBuild/jdk=latest1.8,label=yahoo-not-h2/testReport/org.apache.hadoop.hbase.regionserver/TestRegionMergeTransactionOnCluster/testMergeWithReplicas/
which we can track in separate issue(s).

> TestRegionMergeTransactionOnCluster#testCleanMergeReference is flaky
> --------------------------------------------------------------------
>
>                 Key: HBASE-15192
>                 URL: https://issues.apache.org/jira/browse/HBASE-15192
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15192.v1.patch, HBASE-15192.v2.patch
>
>
> TestRegionMergeTransactionOnCluster#testCleanMergeReference fails
> intermittently due to a failed assertion on the cleaned merge region count:
> {code}
> testCleanMergeReference(org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster)  Time elapsed: 64.183 sec  <<< FAILURE!
> java.lang.AssertionError: null
> 	at org.junit.Assert.fail(Assert.java:86)
> 	at org.junit.Assert.assertTrue(Assert.java:41)
> 	at org.junit.Assert.assertTrue(Assert.java:52)
> 	at org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster.testCleanMergeReference(TestRegionMergeTransactionOnCluster.java:284)
> {code}
> Before calling CatalogJanitor#scan(), the test does:
> {code}
>     int newcount1 = 0;
>     while (System.currentTimeMillis() < timeout) {
>       for(HColumnDescriptor colFamily : columnFamilies) {
>         newcount1 += hrfs.getStoreFiles(colFamily.getName()).size();
>       }
>       if(newcount1 <= 1) {
>         break;
>       }
>       Thread.sleep(50);
>     }
> {code}
> newcount1 is not cleared at the beginning of the loop.
> This means that if the check for newcount1 <= 1 doesn't pass on the first
> iteration, it won't pass in subsequent iterations either, because the count
> only grows.
> After timeout is exhausted, admin.runCatalogScan() is called.
> However, there is a chance that CatalogJanitor#scan() has been called by the
> Chore already (during the wait period), leaving the cleaned count 0 and
> failing the test.
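
The counter problem described above can be illustrated with a minimal, self-contained sketch. This is not the attached patch: the class `StoreFilePollSketch`, the method `waitForStoreFileCount`, and the use of `IntSupplier` as a stand-in for `hrfs.getStoreFiles(colFamily.getName()).size()` are all hypothetical names chosen for illustration. It only shows the idea of recomputing the total on every pass instead of accumulating it; it does not address the race with the CatalogJanitor chore.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

class StoreFilePollSketch {

    /**
     * Polls until the summed per-family file count is at most maxFiles or the
     * timeout expires, and returns the last observed total.
     * Each IntSupplier is a hypothetical stand-in for one column family's
     * store file count.
     */
    static int waitForStoreFileCount(List<IntSupplier> perFamilyCounts,
                                     int maxFiles, long timeoutMs, long sleepMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int count;
        do {
            // Reset on every pass so earlier samples are not added to the total.
            count = 0;
            for (IntSupplier family : perFamilyCounts) {
                count += family.getAsInt();
            }
            if (count <= maxFiles) {
                break;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        } while (System.currentTimeMillis() < deadline);
        return count;
    }

    public static void main(String[] args) {
        // Simulated family whose file count shrinks by one per sample, down to 1.
        int[] remaining = {3};
        IntSupplier shrinking = () -> Math.max(1, remaining[0]--);
        int last = waitForStoreFileCount(Arrays.asList(shrinking), 1, 5000L, 10L);
        System.out.println("last observed count: " + last);
    }
}
```

With the accumulating version quoted in the description, the same simulated samples (3, then 2, then 1) would sum to 3, 5, 6 and never reach the threshold, so the loop would always run until the timeout.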