Hi Kafka developers,

I am planning a Kafka upgrade and would like to clarify two points where the current upgrade documentation is ambiguous to me.

Current environment:

  Kafka version: 3.7.x
  Target version: 4.2.x
  Mode: KRaft
  Deployment shape: dedicated controller nodes and dedicated broker nodes
  Controller quorum: [static controller.quorum.voters]

The Kafka 4.2 upgrade page has a section titled "Upgrading Servers to 4.2.0 from any version 3.3.x through 4.1.x". However, in the detailed Kafka 4.0 upgrade instructions, the rolling upgrade step says to "upgrade the brokers one at a time" and does not explicitly mention isolated KRaft controllers.

I have two main questions:

1. For a KRaft cluster with dedicated controllers and dedicated brokers, what is the recommended rolling upgrade order when upgrading from Kafka 3.7.x to Kafka 4.2.x? Should we upgrade:

   * controllers first, one at a time, preserving controller quorum majority;
   * brokers first, one at a time;
   * or is there no strict order as long as every Kafka server process is upgraded one by one and controller quorum availability is maintained?

   I am especially interested in whether there are compatibility concerns between Kafka 3.7 controllers and Kafka 4.2 brokers, or Kafka 4.2 controllers and Kafka 3.7 brokers, during the rolling window.

2. Is a direct rolling upgrade from Kafka 3.7.x to Kafka 4.2.x considered supported and safe for production KRaft clusters? The Kafka 4.2 documentation wording appears to include upgrades from 3.3.x through 4.1.x, which seems to include 3.7.x. I want to confirm whether a direct 3.7.x -> 4.2.x rolling binary upgrade is recommended, or whether operators should first upgrade to an intermediate version such as 3.9.x, 4.0.x, or 4.1.x.

Related follow-up questions:

* Should metadata.version and kraft.version finalization be delayed until all brokers and all controllers are running the target version?
* Are there any extra precautions for clusters using static controller.quorum.voters versus dynamic controller quorum?

My current understanding is:

* Kafka 4.x requires KRaft, so ZooKeeper-mode clusters cannot go directly to 4.x.
* In KRaft mode, controllers are Kafka server processes too, so they should not be ignored during the rolling upgrade.
* Feature or metadata finalization should happen only after the whole cluster has been upgraded and verified.

Could someone confirm the recommended production-safe upgrade sequence?

Thanks,
Alireza Asgarian
