Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/22867 )
Change subject: KUDU-1973: Send no-op heartbeat operations batched - PART 1 ...................................................................... Patch Set 12: (18 comments) http://gerrit.cloudera.org:8080/#/c/22867/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22867/12//COMMIT_MSG@50 PS12, Line 50: tablets > tables? yes http://gerrit.cloudera.org:8080/#/c/22867/12//COMMIT_MSG@69 PS12, Line 69: 30 > Just curious -- what was the rationale behind having 30 as the highest numb Using 10 decreased the cpu usages with ~50%. Increasing it to 30 only saved additional 2-4% (or 4-8% if you compare it to the already halved time). I will check something higher, but I would be surprised if it would add significant further gains. http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/consensus.proto File src/kudu/consensus/consensus.proto: http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/consensus.proto@450 PS12, Line 450: e.g > nit: e.g. Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h File src/kudu/consensus/multi_raft_batcher.h: http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@76 PS12, Line 76: uint64_t Subscribe(const PeriodicHeartbeater& heartbeater); : void Unsubscribe(uint64_t id); > Please document these, describing the semantics of the parameters as well. Added a longer description above the class. http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@106 PS12, Line 106: next_id > nit: rename to comply with style guideline for field members (should be nex Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@109 PS12, Line 109: MonoTime time; : uint64_t id; > It would be nice to document the fields and the structure overall. Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@112 PS12, Line 112: fix Dealay > fixed delay? Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@116 PS12, Line 116: int flush_interval_; > Why not to store this as MonoDelta? Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@118 PS12, Line 118: void StartTimer(); > style nit for here and elsewhere: don't intersperse method with fields -- f Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.h@154 PS12, Line 154: int flush_interval_; > Why not to store it as MonoDelta? Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc File src/kudu/consensus/multi_raft_batcher.cc: http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@57 PS12, Line 57: DEFINE_int32(multi_raft_heartbeat_interval_ms, : 100, : "The heartbeat interval for batch Raft replication."); > Is there any meaningful invariant between --multi_raft_heartbeat_interval_m I changed its name from interval to window. It is actually forced to be at most raft_heartbeat_interval_ms/2 (See GetFlushInterval()). I added some description. http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@90 PS12, Line 90: DCHECK > nit: DCHECK_EQ might be more expressive Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@120 PS12, Line 120: <std::mutex> > nit for here and elsewhere: in C++17 the compiler is able to deduce the typ Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@128 PS12, Line 128: size() > nit: using empty() for this is usually preferred due to better expression a Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@135 PS12, Line 135: MonoDelta::FromMilliseconds(FLAGS_raft_heartbeat_interval_ms); > nit: create an const instance of MonoDelta once outside of the loop, and us Done http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@138 PS12, Line 138: current_calls.emplace_back(peer_it->second, front.id); > Could it happen that the size of the resulted RPC goes over the limit defin It was split in SendOutScheduled. I moved the spliting here. http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@239 PS12, Line 239: batcher = res->second.lock() > BTW, under what conditions the weak pointer stored in the hash map couldn't Added comment http://gerrit.cloudera.org:8080/#/c/22867/12/src/kudu/consensus/multi_raft_batcher.cc@238 PS12, Line 238: auto res = batchers_.find(hostport); : if (res != batchers_.end() > nit for here and elsewhere: consider using helper from gutil/map-util.h (e. Done -- To view, visit http://gerrit.cloudera.org:8080/22867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie92ba4de5eae00d56cd513cb644dce8fb6e14538 Gerrit-Change-Number: 22867 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Martonka <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Zoltan Chovan <[email protected]> Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Tue, 12 Aug 2025 15:58:59 +0000 Gerrit-HasComments: Yes
