[ https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647007#comment-16647007 ]

Andrew Purtell commented on HBASE-21266:
----------------------------------------

Those test failures in precommit might be flakes, let me see if I can reproduce them.

I ran split, merge, assignment, and balancer tests, including the tests in question, and am not seeing any issues.

{noformat}
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.85 s - in org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.995 s - in org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.962 s - in org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.968 s - in org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.067 s - in org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[WARNING] Tests run: 33, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 43.419 s - in org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.075 s - in org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Running org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.774 s - in org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.836 s - in org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Running org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.692 s - in org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.588 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.796 s - in org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.14 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.715 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.721 s - in org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.082 s - in org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.689 s - in org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.068 s - in org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.386 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 185.692 s - in org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 15, Time elapsed: 90.971 s - in org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentListener
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.569 s - in org.apache.hadoop.hbase.master.TestAssignmentListener
[INFO] Running org.apache.hadoop.hbase.master.TestMasterBalanceThrottling
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.732 s - in org.apache.hadoop.hbase.master.TestMasterBalanceThrottling
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 382.057 s - in org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestBalancerStatusTagInJMXMetrics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.546 s - in org.apache.hadoop.hbase.master.balancer.TestBalancerStatusTagInJMXMetrics
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestFavoredNodeAssignmentHelper
[WARNING] Tests run: 8, Failures: 0, Errors: 0, Skipped: 8, Time elapsed: 0.053 s - in org.apache.hadoop.hbase.master.balancer.TestFavoredNodeAssignmentHelper
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.029 s - in org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
[WARNING] Tests run: 24, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 194.946 s - in org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.936 s - in org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.TestSplitLogManager
[WARNING] Tests run: 15, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 137.272 s - in org.apache.hadoop.hbase.master.TestSplitLogManager
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentManagerMetrics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.514 s - in org.apache.hadoop.hbase.master.TestAssignmentManagerMetrics
[INFO] Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.396 s - in org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
[INFO] Running org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.019 s - in org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 75.973 s - in org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 80.267 s - in org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestTableSplit
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.967 s - in org.apache.hadoop.hbase.mapreduce.TestTableSplit
[INFO] Running org.apache.hadoop.hbase.client.TestSplitOrMergeStatus
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 149.744 s - in org.apache.hadoop.hbase.client.TestSplitOrMergeStatus
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 344, Failures: 0, Errors: 0, Skipped: 27
{noformat}

> Not running balancer because processing dead regionservers, but empty dead rs
> list
> ----------------------------------------------------------------------------------
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.4.8
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch,
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual
> attempts from the shell to run the balancer always return false and this is
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster:
> Not running balancer because processing dead regionserver(s):
> Note the empty list.
> This errant state did not recover without intervention by way of master
> restart, but the test environment was chaotic so needs investigation.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)