[ https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379583#comment-16379583 ]
Andrew Purtell commented on HBASE-19989:
----------------------------------------

I did a clean and fresh checkout on branch-1, now timeouts

[ERROR] testSplitIsRolledBackOnSplitFailure(org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster)  Time elapsed: 60.072 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 60000 milliseconds

[ERROR] testSplitFailedCompactionAndSplit(org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster)  Time elapsed: 60.07 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 60000 milliseconds

[ERROR] testRITStateForRollback(org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster)  Time elapsed: 60.069 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 60000 milliseconds

[ERROR] testTableExistsIfTheSpecifiedTableRegionIsSplitParent(org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster)  Time elapsed: 60.069 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 60000 milliseconds

[ERROR] testFailedSplit(org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster)  Time elapsed: 64.041 s  <<< FAILURE!
junit.framework.AssertionFailedError: Waiting timed out after [60,000] msec

> READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
> ----------------------------------------------------------------------
>
>                 Key: HBASE-19989
>                 URL: https://issues.apache.org/jira/browse/HBASE-19989
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1, 1.4.1
>            Reporter: Ben Lau
>            Assignee: Ben Lau
>            Priority: Major
>             Fix For: 1.3.2, 1.5.0, 1.4.3
>
>         Attachments: HBASE-19989-branch-1.patch
>
>
> Region state transitions do not work correctly for READY_TO_MERGE/SPLIT.
> [~thiruvel] and I noticed this is due to break statements being in the wrong
> place in AssignmentManager. This allows a race condition for example in
> which one of the regions being merged could be moved concurrently, resulting
> in the merge transaction failing and then double assignment and/or dataloss.
> This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not
> branch-2 as the relevant code in AM has since been rewritten.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
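
For readers unfamiliar with this class of bug, the following is a minimal sketch of how a break statement in the wrong place inside a switch can stop a transition code from ever reaching the shared state update. It is not the AssignmentManager code and not the attached patch: the class, enum, and method names (MisplacedBreakSketch, TransitionCode, RegionState, applyBuggy, applyFixed, nextState) are hypothetical and exist only for illustration.

```java
// Hypothetical illustration only; none of these names come from HBase.
class MisplacedBreakSketch {
    enum TransitionCode { READY_TO_SPLIT, SPLIT }
    enum RegionState { OPEN, SPLITTING, SPLIT }

    // Buggy variant: the second break leaves the switch before the
    // shared update in the SPLIT case, so READY_TO_SPLIT never changes state.
    static RegionState applyBuggy(RegionState current, TransitionCode code, boolean splitEnabled) {
        RegionState next = current;
        switch (code) {
            case READY_TO_SPLIT:
                if (!splitEnabled) {
                    break; // reject the request, state stays unchanged
                }
                break; // misplaced: skips the state update below
            case SPLIT:
                next = nextState(code);
                break;
        }
        return next;
    }

    // Fixed variant: only the rejection path breaks; an accepted
    // READY_TO_SPLIT falls through to the shared state update.
    static RegionState applyFixed(RegionState current, TransitionCode code, boolean splitEnabled) {
        RegionState next = current;
        switch (code) {
            case READY_TO_SPLIT:
                if (!splitEnabled) {
                    break; // reject the request, state stays unchanged
                }
                // intentional fall-through to the shared update
            case SPLIT:
                next = nextState(code);
                break;
        }
        return next;
    }

    // Maps a transition code to the state it should produce.
    static RegionState nextState(TransitionCode code) {
        return code == TransitionCode.READY_TO_SPLIT ? RegionState.SPLITTING : RegionState.SPLIT;
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + applyBuggy(RegionState.OPEN, TransitionCode.READY_TO_SPLIT, true));
        System.out.println("fixed: " + applyFixed(RegionState.OPEN, TransitionCode.READY_TO_SPLIT, true));
    }
}
```

In the buggy variant the region stays OPEN after an accepted READY_TO_SPLIT, so in this toy model nothing marks it as being in transition. That is the sort of window in which, per the description above, a concurrent move could be allowed.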