Hi Jun,

Thanks for the reply.

RE JR1: Yeah, I will update the KIP to touch on this static quorum edge case.

RE JR2: That seems reasonable to me, since a single request would replace
two RPC hops (one for RemoveRaftVoter, one for UnregisterController). One
thing to note is that
with KIP-1186
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1186%3A+Update+AddRaftVoterRequest+RPC+to+support+auto-join>,
besides operators manually removing controllers, observer controllers
themselves can send `RemoveRaftVoter` to remove their old incarnations from
the voter set as part of the auto-join feature. With auto-join and this
proposed behavior, explicitly removing a controller's old registration
alongside its old voter set entry can lead to "unsupported" upgrades in the
cluster. An operator performing these steps manually could arguably be
misconfiguring the cluster, but the auto-join feature allowing this
scenario seems like a bug.

Consider the below example with auto-join enabled: 3 controllers in the
voter set (A,B,C) where A supports feature levels X=[0-1], B supports
feature levels X=[0-1], but C only supports X=0. Currently, node A is the
active controller, all 3 controllers are registered, but upgrading feature
X to feature level 1 is not supported because C does not support it.
Controller C restarts with a new disk (now represented as C'). The
auto-join code runs to first remove C from the voter set, and then remove
the registration for C. These records are committed via nodes A and B. Now,
from the active controller's perspective, the cluster does support
upgrading feature X to level 1. There is a race between C' adding itself
back to the KRaft voter set and re-registering itself, and a potential
feature level upgrade. Another interesting thing to note after looking at
the code is that controllers can register even if they do not support the
finalized features of the cluster, which is different from broker
registration. In Kafka's current code, the original registration for C
stays in the log after C is removed as a voter by auto-join, which prevents
an upgrade of feature X. At some point, the registration for C is updated
by C' because C' is a different process incarnation, but a registration
that blocks X's upgrade is always in the log.
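
To make the race concrete, here is a rough model of the scenario above. It is
only a sketch and not Kafka code: `can_upgrade` and the dictionaries are
hypothetical names, and I am assuming the upgrade check simply requires every
registered controller to support the target level.

```python
# Illustrative model only; not Kafka code. All names here are hypothetical.

def can_upgrade(registrations, target_level):
    """Allow the upgrade only if every registered controller supports the level."""
    return all(lo <= target_level <= hi for (lo, hi) in registrations.values())

# Supported range of feature X per registered controller, as (min, max).
registrations = {"A": (0, 1), "B": (0, 1), "C": (0, 0)}
upgrade_ok_before = can_upgrade(registrations, 1)  # C only supports X=0

# Current behavior: auto-join removes C from the voter set, but C's
# registration stays in the log, so the upgrade check is unchanged.
voters = {"A", "B", "C"} - {"C"}
upgrade_ok_current = can_upgrade(registrations, 1)

# Behavior discussed above: auto-join also removes C's registration.
# Until C' re-registers, nothing in the log blocks the upgrade.
without_c = {node: rng for node, rng in registrations.items() if node != "C"}
upgrade_ok_in_window = can_upgrade(without_c, 1)
```

In this model the upgrade is blocked in the first two cases and allowed in the
third, which is the window I am worried about.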

Therefore, Kafka should not unregister a controller when auto-join removes
a controller from the voter set. This means adding a new `RemoveRaftVoter`
RPC version with a boolean field that tells the active controller whether
to also unregister the controller. This field would be completely ignored
by the raft layer, and would instead be handled at the ControllerApis
level. I think it is fine to unregister a controller
whenever the operator runs `kafka-metadata-quorum remove-controller` for a
smooth UX with dynamic quorum. What do you think?
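
As a rough illustration of the layering I have in mind (a sketch only, not
Kafka code): the `unregister` field name, the `Controller` class, and its
methods are all hypothetical stand-ins. The point is that the raft layer never
sees the flag, and only the API layer acts on it.

```python
# Hypothetical sketch, not Kafka code: the field name and helpers are made up.
from dataclasses import dataclass, field


@dataclass
class RemoveRaftVoterRequest:
    voter_id: int
    unregister: bool = False  # hypothetical new field; auto-join leaves it False


@dataclass
class Controller:
    voters: set = field(default_factory=set)
    registrations: set = field(default_factory=set)

    def raft_remove_voter(self, voter_id):
        # Raft layer: only changes the voter set and never sees the flag.
        self.voters.discard(voter_id)

    def handle_remove_raft_voter(self, request):
        # API layer (ControllerApis in the proposal): decides whether to unregister.
        self.raft_remove_voter(request.voter_id)
        if request.unregister:
            self.registrations.discard(request.voter_id)
```

With this shape, auto-join would send the request with the flag unset and
leave the registration alone, while `kafka-metadata-quorum remove-controller`
would set it.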

RE JR3: Maybe we can document this better as part of the code changes to
this KIP, but in my opinion, the kafka-cluster tool deals with cluster
membership (brokers and controllers), which is a metadata layer concept. If
you look at the `list-endpoints` command, you can list out the registered
controller endpoints. In contrast, the kafka-metadata-quorum tool deals
with KRaft, which knows about concepts like leaders, voters, and observers.
The `add-controller` and `remove-controller` sub-commands incidentally
deal with controllers (since controllers can be voters), but the `describe`
sub-command tree also shows information about brokers, which are observers
to KRaft. My decision to include the `unregister-controller` command in the
`kafka-cluster` tool is mainly motivated by this distinction. Additionally,
if we only send `RemoveRaftVoterRequest` in `remove-controller`, it seems hacky
to direct users to use that command for unregistering any controller, since
for observers, the remove voter logic of that request will always fail in
the raft layer. What do you think?

Best,
Kevin Wu


On Tue, Apr 21, 2026 at 8:17 AM Paolo Patierno <[email protected]>
wrote:

> Hi Kevin,
> thanks for the KIP.
> From reading it, this is not stated explicitly, but I would assume
> you are going to expose a new unregisterController method through the
> AdminClient API as well, is my assumption right?
> I expect it would be used underneath by the tools you are going to modify.
> Having such support within the AdminClient API is important when the
> operator is not a human to run the tool but a Kubernetes operator (i.e.
> Strimzi) with the need to unregister a controller.
>
> Thanks,
> Paolo.
>
> On Mon, 20 Apr 2026 at 21:57, Kevin Wu <[email protected]> wrote:
>
> > Hi Jun,
> >
> > Thanks for the reply.
> >
> > RE JR1: I would say the main use case is dynamic quorums, since the
> concept
> > of the observer controller becomes a thing in that world. However, there
> is
> > a static quorum edge case if the operator misconfigures
> > `controller.quorum.voters`. If a new controller voter mistakenly joins
> the
> > cluster, it will also persist a registration record. In my opinion, there
> > should be a way to remove a controller registration via AdminClient CLI
> in
> > all quorum modes.
> >
> > RE JR2: Yes, the existing command only removes the voter, but does not
> > unregister the controller. I left it as a separate flag for now because
> > they are "separate" operations in that being a raft voter is a subset of
> > being a controller in dynamic quorums, but I am not opposed to making
> this
> > command try to do both (remove voter and unregister the controller) by
> > default. In my opinion, an observer controller is "useless" in that it
> does
> > not participate in the leader election or replication parts of the KRaft
> > protocol, so I see no issue with doing both operations always. However,
> an
> > operator may want observer controllers around for other reasons like
> > redundancy. Do you (or others) have any insight into how users may be
> > configuring clusters with observer controllers? If not, I think it is
> okay
> > to remove the flag and make it the default behavior of
> > `kafka-metadata-quorum remove-controller`.
> >
> > RE JR3: Not exactly. The `kafka-metadata-quorum remove-controller ...
> > --unregister` sends 2 RPCs to the active controller, one to remove a node
> > from the voter set, and another to unregister the node. The
> `kafka-cluster
> > unregister-controller` command just sends 1 RPC to the active controller
> to
> > unregister the node. My motivation for having two separate commands is
> > because `remove-controller` is associated with dynamic quorum, since the
> > `RemoveRaftVoter` RPC will fail if kraft.version=0. What do you think?
> >
> > RE JR4: I have updated the sections for the CLI commands in the KIP to
> add
> > this information.
> >
> > RE JR5: This is describing the current implementation of the
> > ControllerRegistrationManager, which will listen to the metadata log and
> > send ControllerRegistrationRequest when the local node id is not
> registered
> > in the log. It looks like this is slightly different from how we handle
> > broker registration in BrokerLifecycleManager. Currently, this code path
> > never executes because controller registrations cannot be removed.
> >
> > Best,
> > Kevin Wu
> >
> > On Fri, Apr 17, 2026 at 2:08 PM Jun Rao via dev <[email protected]>
> > wrote:
> >
> > > Hi, Kevin,
> > >
> > > Thanks for the KIP. A few comments.
> > >
> > > JR1. I guess this is only intended for dynamic KRaft quorums? If so, it
> > > would be useful to clarify that.
> > >
> > > JR2. kafka-metadata-quorum remove-controller --controller-id 9990
> > > --controller-directory-id EXAMPLE_UUID --unregister
> > > So, the existing remove-controller logic only changes the voter set,
> but
> > > doesn't unregister the controller? Should we just always do these two
> > > together? Is there a use case for only removing a controller from the
> > voter
> > > set, but not unregistering?
> > >
> > > JR3. Is kafka-cluster unregister-controller equivalent to
> > > kafka-metadata-quorum remove-controller --controller-id 9990
> > > --controller-directory-id EXAMPLE_UUID --unregister?
> > >
> > > JR4. Could you describe the underlying workflow for each new command
> > (RPCs
> > > sent, metadata records generated, actions taken by the controller,
> etc)?
> > >
> > > JR5. "The registration manager of an unregistered controller already
> > > attempts to re-register with the active controller. This is to prevent
> > > accidental unregistrations."
> > > I don't quite understand this. Why will an unregistered controller
> > attempt
> > > to re-register?
> > >
> > > Jun
> > >
> > > On Fri, Apr 3, 2026 at 11:31 AM Kevin Wu <[email protected]>
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start a discussion on KIP-1312: Support unregistering
> > > > controllers. Below is the KIP link.
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1312%3A+Support+unregistering+controllers
> > > >
> > > > Thanks,
> > > > Kevin Wu
> > > >
> > >
> >
>
>
> --
> Paolo Patierno
>
> *Senior Principal Software Engineer @ IBM* *CNCF Ambassador*
>
> Twitter : @ppatierno <http://twitter.com/ppatierno>
> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
> GitHub : ppatierno <https://github.com/ppatierno>
>
