[ 
https://issues.apache.org/jira/browse/HBASE-25334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249049#comment-17249049
 ] 

Xiaolin Ha commented on HBASE-25334:
------------------------------------

[~zhangduo] The test results on branch-2 match the image I uploaded before: 
after a new server comes online, the group info manager may need a while to 
update its group info cache for that server. The balancer can run before the 
cache is updated, and there is no lock to tell the balancer that the cached 
group info may currently be incorrect or only partial.
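
One way a test could guard against this kind of race is to poll until the 
cache reflects the new server before triggering the balancer or asserting. 
The following is only a minimal, generic sketch and not the actual fix for 
this issue: PollingWait, cachedServers and "rs-new" are hypothetical names 
used for illustration, and the Set merely stands in for the group info cache.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: polls a condition until it holds
// or a timeout expires.
final class PollingWait {

  /**
   * Checks the condition immediately, then every intervalMs milliseconds,
   * until it returns true or timeoutMs milliseconds have elapsed.
   * Returns true if the condition held, false on timeout or interruption.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Hypothetical stand-in for the group info cache: it learns about the
    // new server only some time after the server is online.
    Set<String> cachedServers = ConcurrentHashMap.newKeySet();

    Thread refresher = new Thread(() -> {
      try {
        Thread.sleep(200); // simulated delay before the cache is refreshed
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      cachedServers.add("rs-new");
    });
    refresher.start();

    // Wait for the cache to catch up before doing anything that relies on it.
    boolean seen = waitFor(5_000, 50, () -> cachedServers.contains("rs-new"));
    System.out.println("cache saw new server: " + seen);
    refresher.join();
  }
}
```

In a real test the condition would check the actual group info rather than a 
local Set, and the timeout would need to be long enough to cover the 
throttled region moves.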

> TestRSGroupsFallback.testFallback is flaky
> ------------------------------------------
>
>                 Key: HBASE-25334
>                 URL: https://issues.apache.org/jira/browse/HBASE-25334
>             Project: HBase
>          Issue Type: Test
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>         Attachments: 1607918235175-image.png, 
> image-2020-12-13-10-15-55-445.png
>
>
> For example, see the CI test results of PR [https://github.com/apache/hbase/pull/2699].
> The failed UTs are listed at 
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2699/3/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt]
>  
> This unit test checks that all table regions are assigned after balance, and 
> then asserts on the RS group of the regions.
> But balance() moves regions asynchronously and throttles the moves, sleeping 
> between them until all the table regions have been moved to their RSGroup.
> If the test does not wait at least as long as the region movement takes, the 
> assertion fails.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)