[ https://issues.apache.org/jira/browse/GEODE-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17602982#comment-17602982 ]
ASF subversion and git services commented on GEODE-10409:
---------------------------------------------------------

Commit 0852113f1b8086203ffdd99bae1afa250c2eaa3e in geode's branch refs/heads/develop from WeijieEST
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0852113f1b ]

GEODE-10409: Fix rebalance load model missing collocated regions at s… (#7839)

* GEODE-10409: Fix rebalance load model missing collocated regions at server startup

Assume region A is collocated with A1 and A2, and A is the leader region.
When rebalancing at startup, the rebalance happens after collocation of the
three regions completes, which generally happens in region A2. When the
rebalance load model is calculated from the view of region A2, only the
leader region A and A2 itself are added to the model. This commit fixes the
issue so that A1 is also added to the model.

* add test cases to test rebalance model and remove the static mock

* change test case to avoid changing existing methods for testing

* improve test case


> Rebalance Model Missing Collocated Regions At Server Startup
> ------------------------------------------------------------
>
>                 Key: GEODE-10409
>                 URL: https://issues.apache.org/jira/browse/GEODE-10409
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Weijie Xu
>            Assignee: Weijie Xu
>            Priority: Major
>              Labels: needsTriage, pull-request-available
>         Attachments: server2.log, test.tar.gz
>
>
> The following steps reproduce the issue:
> Run the start.gfsh in the attached example, which configures a Geode system
> with a partitioned region, a gateway sender and a region collocated with the
> partitioned region. So there are three regions in total: the leader region,
> the collocated region and the queue region.
> Then run the example code, which loads ~400M of data and five times that
> amount of events into the system.
> Then stop one of the servers, and revoke that server's disk file.
> Then start the server, which will trigger a bucket recovery.
> From lines 596, 598 and 5958 of the attached log, we can see that the
> queue region is not included in the rebalance model, neither in the data
> size column nor in the max size column.
> Then do a manual rebalance after the server is up. This time the log shows
> that the queue region is added to the model (lines 6010, 6012, 6014 and
> 6028).
>
> The inconsistent behavior leads to two negative results:
> 1) The rebalance result differs between the server startup phase and a
> manual trigger. The startup rebalance reports that everything is OK and the
> rebalance finished, but the manually triggered rebalance reports that there
> is not enough space, since it includes the queue region in the model, and
> that region holds five times as much data as the leader region.
> 2) A mismatch between the rebalance model and the data actually being
> rebalanced (the queue region data is in fact rebalanced even though the
> region is not included in the model during the server startup phase).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
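
For readers following the commit description, here is a minimal sketch of the difference between the two ways of collecting the regions for the rebalance model. It is not Geode's implementation: the class ColocationModelSketch, its methods and the map-based layout are hypothetical names chosen for illustration, and only the region names A, A1 and A2 come from the commit message. The first method walks only from the starting region up to the leader, which reproduces the described symptom when starting from A2. The second collects every region that shares the same leader.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; these are not Geode classes or APIs.
class ColocationModelSketch {
    // All known regions, in registration order.
    private final Set<String> regions = new LinkedHashSet<>();
    // Child region name -> name of the region it is collocated with.
    private final Map<String, String> colocatedWith = new LinkedHashMap<>();

    void addRegion(String name, String parent) {
        regions.add(name);
        if (parent != null) {
            colocatedWith.put(name, parent);
        }
    }

    // Follow the "collocated with" links until a region without a parent is reached.
    String leaderOf(String region) {
        String current = region;
        while (colocatedWith.containsKey(current)) {
            current = colocatedWith.get(current);
        }
        return current;
    }

    // Symptom described in the commit: only the starting region and its
    // ancestors up to the leader are collected, so siblings are missed.
    Set<String> ancestorsOnly(String region) {
        Set<String> result = new LinkedHashSet<>();
        String current = region;
        result.add(current);
        while (colocatedWith.containsKey(current)) {
            current = colocatedWith.get(current);
            result.add(current);
        }
        return result;
    }

    // Intended result: every region whose leader matches the starting region's leader.
    Set<String> wholeGroup(String region) {
        String leader = leaderOf(region);
        Set<String> result = new LinkedHashSet<>();
        for (String candidate : regions) {
            if (leaderOf(candidate).equals(leader)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ColocationModelSketch model = new ColocationModelSketch();
        model.addRegion("A", null);   // leader region
        model.addRegion("A1", "A");   // collocated with A
        model.addRegion("A2", "A");   // collocated with A
        System.out.println(model.ancestorsOnly("A2")); // prints [A2, A]
        System.out.println(model.wholeGroup("A2"));    // prints [A, A1, A2]
    }
}
```

In the scenario from the issue, the region missing from the startup model plays the role of A1 here: it shares the leader but is not on the path from the starting region to the leader.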