[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647007#comment-16647007
 ] 

Andrew Purtell commented on HBASE-21266:
----------------------------------------

Those test failures in precommit might be flakes; let me see if I can 
reproduce them. 

I ran split, merge, assignment, and balancer tests, including the tests in 
question, and am not seeing any issues. 

{noformat}
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.85 s - in org.apache.hadoop.hbase.util.TestMergeTable
[INFO] Running org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.995 s - in org.apache.hadoop.hbase.util.TestMergeTool
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.962 s - in org.apache.hadoop.hbase.util.TestRegionSplitter
[INFO] Running org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.968 s - in org.apache.hadoop.hbase.util.TestRegionSplitCalculator
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.067 s - in org.apache.hadoop.hbase.wal.TestWALSplit
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[WARNING] Tests run: 33, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 43.419 s - in org.apache.hadoop.hbase.wal.TestWALSplitBoundedLogWriterCreation
[INFO] Running org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.075 s - in org.apache.hadoop.hbase.wal.TestWALSplitCompressed
[INFO] Running org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.774 s - in org.apache.hadoop.hbase.mapred.TestSplitTable
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.836 s - in org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy
[INFO] Running org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.692 s - in org.apache.hadoop.hbase.regionserver.TestCompactSplitThread
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.588 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.796 s - in org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.14 s - in org.apache.hadoop.hbase.regionserver.TestSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.715 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.721 s - in org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.082 s - in org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
[INFO] Running org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.689 s - in org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
[INFO] Running org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.068 s - in org.apache.hadoop.hbase.regionserver.TestZKLessMergeOnCluster
[INFO] Running org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.386 s - in org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 185.692 s - in org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
[INFO] Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 15, Time elapsed: 90.971 s - in org.apache.hadoop.hbase.master.TestDistributedLogSplitting
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentListener
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.569 s - in org.apache.hadoop.hbase.master.TestAssignmentListener
[INFO] Running org.apache.hadoop.hbase.master.TestMasterBalanceThrottling
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.732 s - in org.apache.hadoop.hbase.master.TestMasterBalanceThrottling
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 382.057 s - in org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestBalancerStatusTagInJMXMetrics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.546 s - in org.apache.hadoop.hbase.master.balancer.TestBalancerStatusTagInJMXMetrics
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestFavoredNodeAssignmentHelper
[WARNING] Tests run: 8, Failures: 0, Errors: 0, Skipped: 8, Time elapsed: 0.053 s - in org.apache.hadoop.hbase.master.balancer.TestFavoredNodeAssignmentHelper
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.029 s - in org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
[WARNING] Tests run: 24, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 194.946 s - in org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.936 s - in org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer
[INFO] Running org.apache.hadoop.hbase.master.TestSplitLogManager
[WARNING] Tests run: 15, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 137.272 s - in org.apache.hadoop.hbase.master.TestSplitLogManager
[INFO] Running org.apache.hadoop.hbase.master.TestAssignmentManagerMetrics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.514 s - in org.apache.hadoop.hbase.master.TestAssignmentManagerMetrics
[INFO] Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.396 s - in org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
[INFO] Running org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.019 s - in org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 75.973 s - in org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 80.267 s - in org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
[INFO] Running org.apache.hadoop.hbase.mapreduce.TestTableSplit
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.967 s - in org.apache.hadoop.hbase.mapreduce.TestTableSplit
[INFO] Running org.apache.hadoop.hbase.client.TestSplitOrMergeStatus
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 149.744 s - in org.apache.hadoop.hbase.client.TestSplitOrMergeStatus
[INFO] 
[INFO] Results:
[INFO] 
[WARNING] Tests run: 344, Failures: 0, Errors: 0, Skipped: 27
{noformat}

> Not running balancer because processing dead regionservers, but empty dead rs list
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-21266
>                 URL: https://issues.apache.org/jira/browse/HBASE-21266
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.4.8
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 1.5.0, 1.4.9
>
>         Attachments: HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch, 
> HBASE-21266-branch-1.patch, HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic, so this needs investigation.
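
One way the symptom in the description could arise, sketched under assumptions: if the "dead servers in progress" state is tracked by a counter kept separately from the list of dead servers, the two can drift apart, and the guard refuses to balance while the list it prints is empty. Every class and method name below is hypothetical and is not taken from HBase or from the attached patch; the issue itself says the root cause still needs investigation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not HBase's actual dead-server bookkeeping.
final class DeadServerTrackerSketch {
    private final List<String> deadServers = new ArrayList<>();
    // Assumption: "in progress" is a counter maintained separately from the list.
    private int numProcessing = 0;

    synchronized void startProcessing(String server) {
        deadServers.add(server);
        numProcessing++;
    }

    synchronized void finishProcessing(String server) {
        // Decrement only when the server was actually being tracked.
        if (deadServers.remove(server)) {
            numProcessing--;
        }
    }

    // Hypothetical cleanup path that drops the entry but forgets the counter.
    synchronized void removeFromList(String server) {
        deadServers.remove(server);
    }

    synchronized boolean areDeadServersInProgress() {
        return numProcessing > 0;
    }

    synchronized List<String> listDeadServers() {
        return new ArrayList<>(deadServers);
    }

    // Guard in the shape the report describes: no balancing while dead servers are in progress.
    synchronized boolean tryBalance() {
        if (areDeadServersInProgress()) {
            System.out.println("Not running balancer because processing dead regionserver(s): "
                + String.join(",", deadServers));
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        DeadServerTrackerSketch tracker = new DeadServerTrackerSketch();
        tracker.startProcessing("rs1");
        tracker.removeFromList("rs1"); // counter stays at 1, list is now empty
        // The guard logs an empty server list and refuses to balance.
        System.out.println("balancer ran: " + tracker.tryBalance());
    }
}
```

In this sketch the stuck state persists until the counter is reset, which would be consistent with a master restart clearing it. Deriving the "in progress" answer from the tracked collection itself, rather than from a separate counter, is one way to keep the two from diverging.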



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
