Tom Bentley wrote:
> Thanks for the KIP. As Justine mentioned, this KIP currently lacks a
> motivation, and nor does the JIRA provide any context. Please could you
> provide this context, otherwise it's impossible for people on this list to
> understand the problem you're trying to solve here.

Justine Olshan wrote:
> I was curious a bit more about the motivation here. That section seems to be
> missing.

I updated the motivation section with the following text:

KIP-595 introduced KRaft topic partitions. These are partitions with replicas that can achieve consensus on the Kafka log without relying on the Controller or ZK. The KRaft Controllers in KIP-631 use one of these topic partitions (called the cluster metadata topic partition) to order operations on the cluster, commit them to disk, and replicate them to other controllers and brokers.

Consensus on the cluster metadata partition was achieved by the voters (Controllers). If the operator of a KRaft cluster wanted to make changes to the set of voters, they would have to shut down all of the controller nodes and manually make changes to the on-disk state of the old controllers and new controllers. If the operator wanted to replace an existing voter because of a disk failure or general hardware failure, they would have to make sure that the new voter node has a superset of the previous voter's on-disk state. Both of these solutions are manual and error-prone.

This KIP describes a protocol for extending KIP-595 and KIP-630 so that the operator can programmatically update the voter set in a way that is safe and available. There are two important use cases that this KIP supports. One use case is that the operator wants to change the number of controllers by adding or removing a controller. The other use case is that the operator wants to replace a controller because of a disk or hardware failure.

Thanks!
--
-José