[ https://issues.apache.org/jira/browse/HBASE-25334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249049#comment-17249049 ]
Xiaolin Ha commented on HBASE-25334:
------------------------------------

[~zhangduo] The test results on branch-2, like the image I uploaded earlier, show that after a new server comes online, it can take a while for the group info manager to update its group info cache for that server. The balancer can run before the cache is updated, and there is no lock to tell the balancer that the cached group info may currently be incorrect or partial.

> TestRSGroupsFallback.testFallback is flaky
> ------------------------------------------
>
>                 Key: HBASE-25334
>                 URL: https://issues.apache.org/jira/browse/HBASE-25334
>             Project: HBase
>          Issue Type: Test
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>         Attachments: 1607918235175-image.png,
> image-2020-12-13-10-15-55-445.png
>
>
> See the CI test results of PR [https://github.com/apache/hbase/pull/2699];
> the failed UT output is at
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2699/3/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt]
>
> This unit test checks whether all table regions are assigned after balance,
> and then asserts on the RS group of the regions.
> But balance() moves regions asynchronously and throttles the moves, sleeping
> between them until all the table regions have been moved to their RSGroup.
> If the waiting time is shorter than the region movement duration, the
> assertion fails.
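
To illustrate the timing problem described in the issue, here is a minimal sketch in plain Java. It does not use any HBase API: the WaitForGroupSketch class, the waitFor helper, the region and group names, and the timings are all hypothetical stand-ins. The idea is that the test polls for the condition it is about to assert (every region is in the expected group) with a timeout longer than the throttled moves can take, instead of asserting as soon as the regions are assigned.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

public class WaitForGroupSketch {

  // Hypothetical helper: polls the condition every intervalMs until it
  // holds or timeoutMs has elapsed. Returns whether the condition held.
  public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for the region -> RS group placement (not real HBase state).
    Map<String, String> regionGroup = new ConcurrentHashMap<>();
    regionGroup.put("region-1", "default");
    regionGroup.put("region-2", "default");

    // Simulates a throttled, asynchronous balancer: it sleeps before each
    // region move, so the moves finish some time after balance "returns".
    Thread balancer = new Thread(() -> {
      for (String region : new String[] {"region-1", "region-2"}) {
        try {
          Thread.sleep(100);
        } catch (InterruptedException e) {
          return;
        }
        regionGroup.put(region, "fallback");
      }
    });
    balancer.start();

    // Wait on the asserted condition itself, with a generous timeout.
    boolean moved = waitFor(5_000, 20,
        () -> regionGroup.values().stream().allMatch("fallback"::equals));
    balancer.join();
    System.out.println("all regions in target group: " + moved);
  }
}
```

Asserting immediately after balancer.start() in this sketch would see the regions still in "default", which mirrors the flaky assertion. Note that this only covers the wait on the test side; it does not address the stale group info cache mentioned in the comment.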