Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8138 )

Change subject: WIP [consensus] adding/removing NON_VOTER members
......................................................................


Patch Set 7:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc@167
PS6, Line 167:         LOG_WITH_PREFIX(WARNING) << Substitute(
> I thought about that, but decided that would be too much: if due do some bu
Oh, I take it back.  It seemed to me the RequestVotePB parameter arrives from 
some other peer, but that's not the case: RequestVotePB is generated by current 
peer and passed as a parameter into this constructor.

So, after some consideration I think DCHECK() would be the best option here.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/leader_election.cc@166
PS7, Line 166:         // Only voters should start leader elections.
> should this be a DCHECK instead of just a warning? wouldn't it indicate a b
Right, Mike pointed at that already, but it seemed to me that this 
LeaderElection() constructor was called not only to start an election, but also 
to process VoteRequestPB when such a request arrives from a peer.  I found it 
was wrong, so I'm adding that DCHECK() here.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/leader_election.cc@189
PS7, Line 189: Voters
> nit: "Voter UUIDs" reads better for me
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/quorum_util-test.cc
File src/kudu/consensus/quorum_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/quorum_util-test.cc@33
PS7, Line 33: The membership type
            : // is optional: if not specified, the membership type is set to 
VOTER
> it doesn't look optional to me
Yep, the comment is stale.  Fixed.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/quorum_util-test.cc@173
PS7, Line 173: nmi_peer_uuid
> what does 'nmi' stand for?
nmi -- 'no member_type info'.  agreed, that's not expressive one, will update.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc@410
PS6, Line 410:       DisableFailureDetector();
> I'm not sure about that.
Ah, regarding DisableFailureDetector() I think you are right -- that might be 
better.

The second part of my comment was about returning IllegalState() status, 
actually.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/raft_consensus.cc@405
PS7, Line 405:     if (PREDICT_FALSE(active_role != RaftPeerPB::LEADER &&
> isn't this condition guaranteed to be true because of the if statement just
Right, but it's about voter-related role in general.  I replaced that with 
IsVoterRole() to be less confusing.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/raft_consensus.cc@405
PS7, Line 405: active_role != RaftPeerPB::LEADER &&
             :                       active_role != RaftPeerPB::FOLLOWER
> what about adding a utility function like DoesRoleVote(active_role) in quor
That's a good idea.  Done.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/consensus/raft_consensus.cc@410
PS7, Line 410:       DisableFailureDetector();
> I'm not sure this line is necessary. Shouldn't we have never enabled it in
Since we start the failure detection upon start of every replica, it's 
necessary to do that somewhere.  I didn't find the place where things are done 
in a centralized way upon configuration change.

OK, I need to introduce that for generic configuration change.

No tests failed when removing this, but that's expected since in that case 
StartElection() return an error and in the log there are multiple messages like 
'failed to trigger leader election: Illegal state: only voting members can 
start elections'.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3286
PS7, Line 3286: tablet cluster
> I think just 'tablet' is clearer
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3293
PS7, Line 3293: it
> nit: its
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3296
PS7, Line 3296: //   * Tablet leader is established and it reports about the 
newly added replica
> isn't this already done as soon as the config change is committed? the comp
Right, committing Raft configuration update means the leader has been already 
established.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3299
PS7, Line 3299: and
> 'a new one'
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3300
PS7, Line 3300: tablet cluster is available
> not clear what this means. You mean that the 'updated leader is available a
as the sentence explains, that means it's possible to insert data into 
corresponding table.  Updated.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3301
PS7, Line 3301: is hosted by
> contains
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3378
PS7, Line 3378: GetTabletCopyTgtSessionsCount
> let's not abbreviate 'Target' here
Done


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3410
PS7, Line 3410:     ASSERT_OK(LeaderStepDown(leader, tablet_id, kTimeout));
> add a comment for this block here explaining why you're asking it to step d
Done.  Yep, that's about checking that the update cluster is able to elect a 
leader.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3423
PS7, Line 3423:   // Ensure that the replicas converge.
> curious: does this test infrastructure also poll the NON_VOTER replica to s
Along with other verification steps, ClusterVerifier employs 
VerifyCommittedOpIdsMatch() which works with local files under  tablet servers 
to verify that OpIds for all existent tablets match.  In that sense, NON_VOTER 
replicas are covered as well, yes.

I added corresponding comment.


http://gerrit.cloudera.org:8080/#/c/8138/7/src/kudu/integration-tests/raft_consensus-itest.cc@3507
PS7, Line 3507:   // Remove the newly added replica.
> can we also assert here that the removed replica gets TOMBSTONED?
That's a good idea, thanks!



--
To view, visit http://gerrit.cloudera.org:8080/8138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2662d45ad9bb6a4bf325d4202c2ee619ffad02b7
Gerrit-Change-Number: 8138
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Fri, 06 Oct 2017 23:51:10 +0000
Gerrit-HasComments: Yes

Reply via email to