[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Adar Dembo has submitted this change and it was merged. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. consensus_peers: replace bespoke Raft heartbeat logic with periodic timers Building on the generic periodic timer implementation, this patch replaces the one-off Raft heartbeating code with periodic timers. There's only one semantic change, but it's an important one: the jittering range has changed from [P/2, P] to [3P/4, 5P/4]. When I wrote commit 1070e76 I was nervous about making such a change, but since it reduces overall heartbeat load, I think it makes sense to do it. Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Reviewed-on: http://gerrit.cloudera.org:8080/7734 Reviewed-by: Dan Burkert Tested-by: Kudu Jenkins --- M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/raft_consensus_quorum-test.cc 3 files changed, 24 insertions(+), 90 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Dan Burkert has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Dan Burkert has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 5: flake looked like KUDU-1736 so I retriggered it -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Dan Burkert has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Hello Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7734 to look at the new patch set (#4). Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. consensus_peers: replace bespoke Raft heartbeat logic with periodic timers Building on the generic periodic timer implementation, this patch replaces the one-off Raft heartbeating code with periodic timers. There's only one semantic change, but it's an important one: the jittering range has changed from [P/2, P] to [3P/4, 5P/4]. When I wrote commit 1070e76 I was nervous about making such a change, but since it reduces overall heartbeat load, I think it makes sense to do it. Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf --- M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/raft_consensus_quorum-test.cc 3 files changed, 24 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/7734/4 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Todd Lipcon has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Adar Dembo has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/7734/2//COMMIT_MSG Commit Message: Line 15: heartbeat load, I think it makes sense to do it. > perhaps worth re-running the experiment from https://docs.google.com/docume Sure, I'll do that post-commit. http://gerrit.cloudera.org:8080/#/c/7734/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: Line 140: > I think std::function has an implicit constructor from any callable (seriou Thanks, fixed. -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Hello Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7734 to look at the new patch set (#3). Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. consensus_peers: replace bespoke Raft heartbeat logic with periodic timers Building on the generic periodic timer implementation, this patch replaces the one-off Raft heartbeating code with periodic timers. There's only one semantic change, but it's an important one: the jittering range has changed from [P/2, P] to [3P/4, 5P/4]. When I wrote commit 1070e76 I was nervous about making such a change, but since it reduces overall heartbeat load, I think it makes sense to do it. Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf --- M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h 2 files changed, 24 insertions(+), 89 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/7734/3 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Todd Lipcon has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7734/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: Line 140: std::bind([w]() { > oh sure, if the task abstraction has a constructor for that (I haven't chec I think std::function has an implicit constructor from any callable (serious template magic) -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Dan Burkert has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7734/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: Line 140: std::bind([w]() { > oh, good catch. shouldn't you be able to just pass the lambda directly anyw oh sure, if the task abstraction has a constructor for that (I haven't checked). -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Todd Lipcon has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7734/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: Line 140: std::bind([w]() { > I think it would be more clear to create a std::function or whatever type d oh, good catch. shouldn't you be able to just pass the lambda directly anyway, with no binding? -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Dan Burkert has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7734/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: Line 140: std::bind([w]() { I think it would be more clear to create a std::function or whatever type directly instead of an indirectly through a no arg bind. -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Todd Lipcon has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/7734/2//COMMIT_MSG Commit Message: Line 15: heartbeat load, I think it makes sense to do it. perhaps worth re-running the experiment from https://docs.google.com/document/d/1066W63e2YUTNnecmfRwgAHghBPnL1Pte_gJYAaZ_Bjo/edit# to make sure the new jitter parameters don't regress the stabilization after a failure. I think it's fine to do that after commit, though, since you did the small-scale experiment that you commented on in this gerrit. -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Adar Dembo has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: I reran Todd's experiments from https://gerrit.cloudera.org/#/c/7331. With this patch series: leader thread count: 122 512 heartbeats/sec received by each replica (based on replica RPC metrics) 2025 voluntary ctx switch/sec on leader (based on metrics) 214 ms/sec user CPU on leader (based on metrics) 78 ms/sec system CPU on leader (based on metrics) perf stat -I1000 on leader as sanity check: 5.003299495 205.429271 task-clock 5.003299495 5,654 context-switches 5.003299495 65 cpu-migrations 5.003299495 0 page-faults 5.003299495380,464,010 cycles 5.003299495125,979,277 instructions 5.003299495 23,021,050 branches 5.003299495 1,442,498 branch-misses Couple interesting things: 1. As noted in the commit description, the average heartbeat period has changed again. It's slightly longer, which accounts for the drop from 638 hb/s to 512 (~25%). 2. The thread count is much lower because I also tested with the fd consolidation patch enabled. 3. Same goes for voluntary ctx switch rate, which I assume has changed for the same reason. I also repeated the "kill -STOP, sleep 5s, kill -CONT" to make sure the heartbeats remain jittered. You can see a screenshot of the trace here: https://ibb.co/h1agK5. -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Adar Dembo has posted comments on this change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. Patch Set 2: Verified+1 Overriding Jenkins, TSAN failure in raft_consensus-itest was another instance of KUDU-2059. -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus peers: replace bespoke Raft heartbeat logic with periodic timers
Hello David Ribeiro Alves, Mike Percy, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/7734 to review the following change. Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers .. consensus_peers: replace bespoke Raft heartbeat logic with periodic timers Building on the generic periodic timer implementation, this patch replaces the one-off Raft heartbeating code with periodic timers. There's only one semantic change, but it's an important one: the jittering range has changed from [P/2, P] to [3P/4, 5P/4]. When I wrote commit 1070e76 I was nervous about making such a change, but since it reduces overall heartbeat load, I think it makes sense to do it. Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf --- M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h 2 files changed, 24 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/7734/1 -- To view, visit http://gerrit.cloudera.org:8080/7734 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon