Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/19931 )
Change subject: KUDU-3476: Make replica placement range and table aware
......................................................................

Patch Set 12:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/19931/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19931/11//COMMIT_MSG@25
PS11, Line 25: controls whether or not the replica placement is range aware.
> Just for reference: the following changelist provides a proper link to the
Shared the correct public-facing design doc and sent the doc to the mailing list.

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/integration-tests/create-table-itest.cc
File src/kudu/integration-tests/create-table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/integration-tests/create-table-itest.cc@484
PS10, Line 484: ASSERT_EQ(range_bounds.size(), table.second.size());
             : for (const auto& ranges : table.second) {
             :   // Since there are ten buckets instead of five for table "table1",
             :   // we expect twice as many rep
> nit: Maybe add a comment on how did we arrive at these numbers?
Changed the values from hardcoded constants to computed ones so it's easier to see where they come from.

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/master.proto@338
PS10, Line 338: // The number of tablets that are BOOTSTRAPPING or RUNNING per range.
> Done
Done

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/master_service.cc
File src/kudu/master/master_service.cc:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/master_service.cc@459
PS10, Line 459: req->num_live_tablets_by_dimension().e
> Maybe add a todo if planning to address in future.
Done

http://gerrit.cloudera.org:8080/#/c/19931/11/src/kudu/master/placement_policy-test.cc
File src/kudu/master/placement_policy-test.cc:

http://gerrit.cloudera.org:8080/#/c/19931/11/src/kudu/master/placement_policy-test.cc@1066
PS11, Line 1066: ASSERT_LE(748
> Curious as to how did we arrive at this range [748, 752]?
In most cases the value should be 750, but there is some flakiness, so I widened the range a little to guard against it.

http://gerrit.cloudera.org:8080/#/c/19931/11/src/kudu/master/placement_policy-test.cc@1441
PS11, Line 1441: 3110
> nit: Should this be 3110? Also curious on what causes the deviation here fr
Correct, it should be 3110; I fixed that. Some of these test cases are a bit flaky because we don't expect (or need) the exact same value on every run; as long as the result lands in a range near the expected value, that's fine. As for what causes the flakiness: the recent-replica-creation values per range and table are eventually decayed back to zero, so the deviation comes from that decay combined with the initial load on these servers. The decay isn't identical across runs because it depends on how long it takes to place replicas on each tserver, so we sometimes see a slight deviation.

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc@214
PS10, Line 214: const double kHalflifeSecs = FLAGS_tserver_last_replica_creations_halflife_ms / 1000;
             : const MonoTime now = MonoTime::Now();
             : // If map for the table or range hasn't been initialized yet, use init_time_ as last decay.
             : MonoTime last_decay =
             :     last_replica_decay_by_range_.find(table_id) == last_replica_decay_by_range_.end() ||
             :     last_replica_decay_by_range_[table_id].first.find(range_start_key) ==
             :         last_replica_decay_by_range_[table_id].first.end() ?
             :         init_time_ : last_replica_decay_by_range_[table_id].first[range_start_key];
             : double secs_since_last_decay = (now - last_decay).To
> Probably, I haven't properly articulated my question. I was curious exactl
The purpose of the decay in this context is to eventually "reset" these values to zero. Without the decay, the recent-replica-creation values for ranges/tables would just keep increasing. Since we are only interested in the number of replicas created recently, it makes sense for these values to return to zero, because the tablet servers will eventually report the number of live replicas by range/table themselves. We only need these values for recently created replicas while the servers aren't yet reporting them; so after, say, creating a table, once all its replicas are placed it's safe to decay the values back to zero, since the tablet servers will reflect the newly placed replicas through GetNumLiveTabletsByRangePerTable().

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h@222
PS10, Line 222: umLiveTabletsByRangePerT
> Maybe this will help:
Thanks Abhishek, including replicas in the INITIALIZED state makes sense to me. The FAILED, STOPPING, and STOPPED states probably shouldn't be included, as tombstoned replicas don't serve read/write requests anyway. Let me know what you both think about adding the INITIALIZED state and not the others. Should I also add it to the two methods above, GetNumLiveTablets() and GetNumLiveTabletsByDimension()?
--
To view, visit http://gerrit.cloudera.org:8080/19931
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9caeb8d5547e946bfeb152a99e1ec034c3fa0a0f
Gerrit-Change-Number: 19931
Gerrit-PatchSet: 12
Gerrit-Owner: Mahesh Reddy <mre...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <ale...@apache.org>
Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mre...@cloudera.com>
Gerrit-Reviewer: Marton Greber <greber...@gmail.com>
Gerrit-Reviewer: Zoltan Chovan <zcho...@cloudera.com>
Gerrit-Reviewer: Ádám Bakai <aba...@cloudera.com>
Gerrit-Comment-Date: Fri, 30 Jun 2023 00:22:39 +0000
Gerrit-HasComments: Yes