Hi Kafka developers,

I am planning a Kafka upgrade and I would like to clarify two points where
the current upgrade documentation is ambiguous to me.

Current environment:

Kafka version: 3.7.x
Target version: 4.2.x
Mode: KRaft
Deployment shape: dedicated controller nodes and dedicated broker nodes
Controller quorum: static (controller.quorum.voters)

The Kafka 4.2 upgrade page has a section titled "Upgrading Servers to 4.2.0
from any version 3.3.x through 4.1.x". However, in the detailed Kafka 4.0
upgrade instructions, the rolling upgrade step says to "upgrade the brokers
one at a time" and does not explicitly mention isolated KRaft controllers.

I have two main questions:

1. For a KRaft cluster with dedicated controllers and dedicated brokers,
what is the recommended rolling upgrade order when upgrading from Kafka
3.7.x to Kafka 4.2.x?

Should we upgrade:

* controllers first, one at a time, preserving controller quorum majority;
* brokers first, one at a time;
* or is there no strict order as long as every Kafka server process is
upgraded one by one and controller quorum availability is maintained?

I am especially interested in whether there are compatibility concerns
between Kafka 3.7 controllers and Kafka 4.2 brokers, or Kafka 4.2
controllers and Kafka 3.7 brokers, during the rolling window.
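
To be concrete about what I mean by preserving the controller quorum
majority, here is the arithmetic as a small Python sketch. The function name
is my own and purely illustrative; it only encodes the usual majority rule
for a Raft-style quorum and does not call any Kafka API.

```python
def max_controllers_down(voters: int) -> int:
    """Number of voters that can be offline while a majority stays up."""
    if voters < 1:
        raise ValueError("a quorum needs at least one voter")
    majority = voters // 2 + 1  # smallest group that is more than half
    return voters - majority


for n in (3, 5):
    print(n, "voters:", max_controllers_down(n), "may be down at once")
```

With 3 voters only one controller can be down at a time, so I would restart
controllers strictly one by one and wait for each to rejoin the quorum before
moving on.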

2. Is a direct rolling upgrade from Kafka 3.7.x to Kafka 4.2.x considered
supported and safe for production KRaft clusters?

The Kafka 4.2 documentation wording appears to include upgrades from 3.3.x
through 4.1.x, which seems to include 3.7.x. I want to confirm whether a
direct 3.7.x -> 4.2.x rolling binary upgrade is recommended, or whether
operators should first upgrade to an intermediate version such as 3.9.x,
4.0.x, or 4.1.x.

Related follow-up questions:

* Should metadata.version and kraft.version finalization be delayed until
all brokers and all controllers are running the target version?
* Are there any extra precautions for clusters using static
controller.quorum.voters versus a dynamic controller quorum?

My current understanding is:

* Kafka 4.x requires KRaft, so ZooKeeper-mode clusters cannot go directly
to 4.x.
* In KRaft mode, controllers are Kafka server processes too, so they should
not be ignored during the rolling upgrade.
* Feature or metadata finalization should happen only after the whole
cluster has been upgraded and verified.
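
Expressed as a minimal Python sketch, the gate I have in mind before
finalization looks like this. The node names, version strings, and helper
name are hypothetical; in practice the running versions would come from our
own inventory or monitoring, and I am not assuming any particular Kafka API
here.

```python
def ready_to_finalize(running_versions: dict[str, str], target: str) -> bool:
    """True only if every controller and broker runs the target release."""
    return bool(running_versions) and all(
        version == target for version in running_versions.values()
    )


# Hypothetical inventory in the middle of the rolling window.
cluster = {
    "controller-1": "4.2.0",
    "controller-2": "4.2.0",
    "controller-3": "4.2.0",
    "broker-1": "4.2.0",
    "broker-2": "3.7.1",  # not yet restarted on the new binaries
}

print(ready_to_finalize(cluster, "4.2.0"))  # False: broker-2 is still on 3.7.1
```

Only once this check passes for all controllers and all brokers, and the
cluster has been verified, would I finalize the features.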

Could someone confirm the recommended production-safe upgrade sequence?

Thanks,
Alireza Asgarian
