Hi Jun,

Thanks for the questions and discussion. I wanted to add one more thing for
your consideration regarding JR2. A server-side implementation on
the controller that does both voter removal and controller unregistration does
not provide more consistent behavior than having the AdminClient or
`kafka-metadata-quorum` CLI orchestrate this workflow. The only benefit
would be sending one RPC instead of two. At the moment, I don't think this
is a good enough argument for changing a KRaft RPC schema, since the new
field would not be used by the KRaft layer. We can always implement this
workflow on the client side first, and explore the server-side implementation
later if it makes more sense. What do you think?
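
As a rough sketch of what client-side orchestration could look like: the
`ControllerAdmin` interface and its method names below are hypothetical
stand-ins for illustration, not the actual AdminClient API, and failures are
assumed to surface as unchecked exceptions. The point is only that the client
issues the two calls in order and skips unregistration if voter removal fails.

```java
import java.util.ArrayList;
import java.util.List;

public class ClientSideRemoval {
    /** Hypothetical stand-in for the two admin calls; not the real AdminClient API. */
    interface ControllerAdmin {
        void removeVoter(int nodeId, String directoryId);
        void unregisterController(int nodeId);
    }

    /**
     * Removes the node from the voter set, then unregisters it.
     * If removeVoter throws, unregisterController is never called.
     */
    static void removeAndUnregister(ControllerAdmin admin, int nodeId, String directoryId) {
        admin.removeVoter(nodeId, directoryId);
        admin.unregisterController(nodeId);
    }

    public static void main(String[] args) {
        // Record the calls instead of talking to a real cluster.
        List<String> calls = new ArrayList<>();
        ControllerAdmin recording = new ControllerAdmin() {
            public void removeVoter(int nodeId, String directoryId) {
                calls.add("removeVoter:" + nodeId);
            }
            public void unregisterController(int nodeId) {
                calls.add("unregister:" + nodeId);
            }
        };
        removeAndUnregister(recording, 9990, "EXAMPLE_UUID");
        System.out.println(calls);
    }
}
```

The server-side variant would only collapse these two calls into one RPC; the
ordering and failure handling stay the same either way.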

Best,
Kevin Wu

On Tue, Apr 21, 2026 at 1:38 PM Kevin Wu <[email protected]> wrote:

> Hi Jonah,
>
> Thanks for the reply.
>
> RE JH1: Sure, an operator can mistakenly try to unregister a controller
> that is part of the KRaft voter set. The implementation can have the active
> controller reject unregistration of KRaft voters. That said, the
> registration manager will attempt to re-register, so the cluster would
> "recover" with respect to this state.
>
> Best,
> Kevin Wu
>
> On Tue, Apr 21, 2026 at 12:34 PM Jonah Hooper via dev <
> [email protected]> wrote:
>
>> Thanks for the KIP Kevin!
>>
>> > This is to prevent accidental unregistrations. The intention for
>> unregistration is for it to occur after the operator decommissions a
>> controller node.
>>
>> JH1: The KIP discusses the case where a controller is registered but no
>> voter exists. However, the KIP could potentially add an inverse: the
>> ControllerRegistration (for Node A, for example) is removed successfully
>> but Node A remains healthy and is still a Voter.
>> ControllerRegistrationManager listens to the MetadataPublisher pipeline
>> which is derived with some delay from the Raft layer. Since the "controller
>> metadata" layer might not know it is unregistered, it could attempt to
>> re-register. A correctness property of the UnregisterController workflow is
>> that the controller being unregistered is actually decommissioned. This may
>> be reasonable in this case, but I wonder if there is a way to design this
>> so that it would even work on a controller which is active and part of the
>> quorum.
>>
>> Best,
>> Jonah
>>
>>
>>
>> On Tue, Apr 21, 2026 at 12:38 PM Kevin Wu <[email protected]> wrote:
>>
>> > Hi Paolo,
>> >
>> > I have included a section outlining the AdminClient API changes for this
>> > KIP. Thanks for pointing that out.
>> >
>> > Best,
>> > Kevin Wu
>> >
>> > On Tue, Apr 21, 2026 at 11:24 AM Kevin Wu <[email protected]>
>> wrote:
>> >
>> > > Hi Jun,
>> > >
>> > > Thanks for the reply.
>> > >
>> > > RE JR1: Yeah, I will update the KIP to touch on this static quorum edge
>> case.
>> > >
>> > > RE JR2: That seems reasonable to me, since we would avoid two RPC hops
>> > > (one for RemoveVoter, one for UnregisterController). One thing to
>> note is
>> > > that with KIP-1186
>> > > <
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1186%3A+Update+AddRaftVoterRequest+RPC+to+support+auto-join
>> > >,
>> > > besides operators manually removing controllers, observer controllers
>> > > themselves can send `RemoveRaftVoter` to remove their old incarnations
>> > from
>> > > the voter set as part of the auto-join feature. With auto-join and
>> this
>> > > proposed behavior, explicitly removing a controller's old registration
>> > > alongside its old voter set entry can lead to "unsupported" upgrades
>> in
>> > the
>> > > cluster. An operator doing these steps manually could be seen as
>> > > misconfiguring the cluster, but the auto-join feature allowing for
>> this
>> > > scenario seems like a bug.
>> > >
>> > > Consider the following example with auto-join enabled: 3 controllers in
>> the
>> > > voter set (A,B,C) where A supports feature levels X=[0-1], B supports
>> > > feature levels X=[0-1], but C only supports X=0. Currently, node A is
>> the
>> > > active controller, all 3 controllers are registered, but upgrading
>> > feature
>> > > X to feature level 1 is not supported because C does not support it.
>> > > Controller C restarts with a new disk (now represented as C'). The
>> > > auto-join code runs to first remove C from the voter set, and then
>> remove
>> > > the registration for C. These records are committed via nodes A and B.
>> > Now,
>> > > from the active controller's perspective, the cluster does support
>> > > upgrading feature X to level 1. There is a race between C' adding
>> itself
>> > > back to the KRaft voter set and re-registering itself, and a potential
>> > > feature level upgrade. Another interesting thing to note after
>> looking at
>> > > the code is that controllers can register even if they do not support
>> the
>> > > finalized features of the cluster, which is different from broker
>> > > registration. In Kafka's current code, the original registration for C
>> > > stays in the log after C is removed as a voter by auto-join, which
>> > prevents
>> > > an upgrade of feature X. At some point, the registration for C is
>> updated
>> > > by C' because C' is a different process incarnation, but a
>> registration
>> > > that blocks X's upgrade is always in the log.
>> > >
>> > > Therefore, Kafka should not unregister a controller when auto-join
>> > removes
>> > > a controller from the voter set. This means including a new RPC
>> version
>> > for
>> > > `RemoveRaftVoter` that introduces a boolean field telling the active
>> > > controller whether to also unregister the controller. This field
>> would be
>> > > completely ignored by the raft layer, and instead would be handled at
>> the
>> > > ControllerApis level. I think it is fine to unregister a controller
>> > > whenever the operator runs `kafka-metadata-quorum remove-controller`
>> for
>> > a
>> > > smooth UX with dynamic quorum. What do you think?
>> > >
>> > > RE JR3: Maybe we can document this better as part of the code changes
>> to
>> > > this KIP, but in my opinion, the kafka-cluster tool deals with cluster
>> > > membership (brokers and controllers), which is a metadata layer
>> concept.
>> > If
>> > > you look at the `list-endpoints` command, you can list out the
>> registered
>> > > controller endpoints. In contrast, the kafka-metadata-quorum tool
>> deals
>> > > with KRaft, which knows about concepts like leaders, voters, and
>> observers.
>> > > The `add-controller` and `remove-controller` sub-commands
>> incidentally
>> > > deal with controllers (since controllers can be voters), but the
>> > `describe`
>> > > sub-command tree also shows information about brokers, which are
>> > observers
>> > > to KRaft. My decision to include the `unregister-controller` command
>> in
>> > the
>> > > `kafka-cluster` tool is mainly motivated by this distinction.
>> > Additionally,
>> > > if we only send `RemoveVoterRequest` in `remove-controller`, it seems
>> > hacky
>> > > to direct users to use that command for unregistering any controller,
>> > since
>> > > for observers, the remove voter logic of that request will always
>> fail in
>> > > the raft layer. What do you think?
>> > >
>> > > Best,
>> > > Kevin Wu
>> > >
>> > >
>> > > On Tue, Apr 21, 2026 at 8:17 AM Paolo Patierno <
>> [email protected]
>> > >
>> > > wrote:
>> > >
>> > >> Hi Kevin,
>> > >> thanks for the KIP.
>> > >> From reading it, this is not stated explicitly, but I would
>> assume
>> > >> you are going to expose a new unregisterController method through the
>> > >> AdminClient API as well. Is my assumption right?
>> > >> I expect it would be used underneath by the tools you are going to
>> > modify.
>> > >> Having such support within the AdminClient API is important when the
>> > >> operator is not a human running the tool but a Kubernetes operator
>> (e.g.
>> > >> Strimzi) with the need to unregister a controller.
>> > >>
>> > >> Thanks,
>> > >> Paolo.
>> > >>
>> > >> On Mon, 20 Apr 2026 at 21:57, Kevin Wu <[email protected]>
>> wrote:
>> > >>
>> > >> > Hi Jun,
>> > >> >
>> > >> > Thanks for the reply.
>> > >> >
>> > >> > RE JR1: I would say the main use case is dynamic quorums, since the
>> > >> concept
>> > >> > of the observer controller becomes a thing in that world. However,
>> > >> there is
>> > >> > a static quorum edge case if the operator misconfigures
>> > >> > `controller.quorum.voters`. If a new controller voter mistakenly
>> joins
>> > >> the
>> > >> > cluster, it will also persist a registration record. In my opinion,
>> > >> there
>> > >> > should be a way to remove a controller registration via AdminClient
>> > CLI
>> > >> in
>> > >> > all quorum modes.
>> > >> >
>> > >> > RE JR2: Yes, the existing command only removes the voter, but does
>> not
>> > >> > unregister the controller. I left it as a separate flag for now
>> > because
>> > >> > they are "separate" operations in that being a raft voter is a
>> subset
>> > of
>> > >> > being a controller in dynamic quorums, but I am not opposed to
>> making
>> > >> this
>> > >> > command try to do both (remove voter and unregister the
>> controller) by
>> > >> > default. In my opinion, an observer controller is "useless" in
>> that it
>> > >> does
>> > >> > not participate in the leader election or replication parts of the
>> > KRaft
>> > >> > protocol, so I see no issue with doing both operations always.
>> > However,
>> > >> an
>> > >> > operator may want observer controllers around for other reasons
>> like
>> > >> > redundancy. Do you (or others) have any insight into how users may
>> be
>> > >> > configuring clusters with observer controllers? If not, I think it
>> is
>> > >> okay
>> > >> > to remove the flag and make it the default behavior of
>> > >> > `kafka-metadata-quorum remove-controller`.
>> > >> >
>> > >> > RE JR3: Not exactly. The `kafka-metadata-quorum remove-controller
>> ...
>> > >> > --unregister` sends 2 RPCs to the active controller, one to remove
>> a
>> > >> node
>> > >> > from the voter set, and another to unregister the node. The
>> > >> `kafka-cluster
>> > >> > unregister-controller` command just sends 1 RPC to the active
>> > >> controller to
>> > >> > unregister the node. My motivation for having two separate
>> commands is
>> > >> > that `remove-controller` is associated with dynamic quorum,
>> since
>> > the
>> > >> > `RemoveRaftVoter` RPC will fail if kraft.version=0. What do you
>> > >> think?
>> > >> >
>> > >> > RE JR4: I have updated the sections for the CLI commands in the
>> KIP to
>> > >> add
>> > >> > this information.
>> > >> >
>> > >> > RE JR5: This is describing the current implementation of the
>> > >> > ControllerRegistrationManager, which will listen to the metadata
>> log
>> > and
>> > >> > send ControllerRegistrationRequest when the local node id is not
>> > >> registered
>> > >> > in the log. It looks like this is slightly different from how we
>> > handle
>> > >> > broker registration in BrokerLifecycleManager. Currently, this code
>> > path
>> > >> > never executes because controller registrations cannot be removed.
>> > >> >
>> > >> > Best,
>> > >> > Kevin Wu
>> > >> >
>> > >> > On Fri, Apr 17, 2026 at 2:08 PM Jun Rao via dev <
>> [email protected]
>> > >
>> > >> > wrote:
>> > >> >
>> > >> > > Hi, Kevin,
>> > >> > >
>> > >> > > Thanks for the KIP. A few comments.
>> > >> > >
>> > >> > > JR1. I guess this is only intended for dynamic KRaft quorums? If
>> so,
>> > >> it
>> > >> > > would be useful to clarify that.
>> > >> > >
>> > >> > > JR2. kafka-metadata-quorum remove-controller --controller-id 9990
>> > >> > > --controller-directory-id EXAMPLE_UUID --unregister
>> > >> > > So, the existing remove-controller logic only changes the voter
>> set,
>> > >> but
>> > >> > > doesn't unregister the controller? Should we just always do these
>> > two
>> > >> > > together? Is there a use case for only removing a controller from
>> > the
>> > >> > voter
>> > >> > > set, but not unregistering?
>> > >> > >
>> > >> > > JR3. Is kafka-cluster unregister-controller equivalent to
>> > >> > > kafka-metadata-quorum remove-controller --controller-id 9990
>> > >> > > --controller-directory-id EXAMPLE_UUID --unregister?
>> > >> > >
>> > >> > > JR4. Could you describe the underlying workflow for each new
>> command
>> > >> > (RPCs
>> > >> > > sent, metadata records generated, actions taken by the
>> controller,
>> > >> etc)?
>> > >> > >
>> > >> > > JR5. "The registration manager of an unregistered controller
>> already
>> > >> > > attempts to re-register with the active controller. This is to
>> > prevent
>> > >> > > accidental unregistrations."
>> > >> > > I don't quite understand this. Why will an unregistered
>> controller
>> > >> > attempt
>> > >> > > to re-register?
>> > >> > >
>> > >> > > Jun
>> > >> > >
>> > >> > > On Fri, Apr 3, 2026 at 11:31 AM Kevin Wu <[email protected]
>> >
>> > >> wrote:
>> > >> > >
>> > >> > > > Hi all,
>> > >> > > >
>> > >> > > > I would like to start a discussion on KIP-1312: Support
>> > >> unregistering
>> > >> > > > controllers. Below is the KIP link.
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1312%3A+Support+unregistering+controllers
>> > >> > > >
>> > >> > > > Thanks,
>> > >> > > > Kevin Wu
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >>
>> > >> --
>> > >> Paolo Patierno
>> > >>
>> > >> *Senior Principal Software Engineer @ IBM*
>> > >> *CNCF Ambassador*
>> > >>
>> > >> Twitter : @ppatierno <http://twitter.com/ppatierno>
>> > >> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
>> > >> GitHub : ppatierno <https://github.com/ppatierno>
>> > >>
>> > >
>> >
>>
>
