Thanks for the KIP Kevin. The motivation is clear to me and beneficial
for implementing the auto-join feature designed in KIP-853.

In the "Compatibility ..." section you state the following:
"To make this change backwards compatible, we can make this field
ignorable. This ensures compatibility between a controller that is
adding itself with the new AddRaftVoterRequest via auto-join, and an
old active controller that does not understand this field (i.e. does
not have the auto-join feature). In that case, the unavailability
issue described above is possible."

We have at least two options in the design of this feature when the
leader doesn't support the new flag. One option is what you suggest
which is to ignore the flag and behave how it currently does. As you
point out this can cause the leader to failover if the voter getting
added is needed to commit the new voter set.

The other option is to not send the AddRaftVoter request if the leader
doesn't support this feature. This means that both the leader and the
follower need to support version 1 of AddRaftVoter for the feature to
work. The advantage of this solution is that it doesn't cause any
unavailability in the KRaft partition (the cluster metadata
partition).

My opinion is that we should require that both replicas (leader and
follower) support this feature. This means that the new field should
not be ignored and the sending replica should handle the unsupported
version exception and error. We used a similar mechanism when
designing and implementing the KIP-996: Pre-Vote.

The new field has the following name and documentation:
"{"name": "FromController", "type": "bool", "versions": "1+",
"ignorable": true, "about": "True if the request is sent from another
controller as part of controller auto joining, false if sent via the
admin client."}"

The field name and documentation should reflect the difference in
semantics and less the context and usage of the feature. The
difference in semantics is that for the default value, the RPC
response is sent when the changes are committed. While for the
opposite value, the RPC response is sent when the changes have been
written. With this in mind, how about naming the field
"AckWhenCommitted" where the default value is true. I am open to
changing the field name if you have a better name in mind.

Thanks,
-- 
-José

Reply via email to