Hey folks, I just wanted to update this thread with some minor changes I've made to the KIP based on the implementation in 3.4.
* No new ApiVersionsResponse, we use the "metadata.version" feature flag instead * Pre-Migration instead of "Soft Start" for the phase of the controller before it can accept metadata changes * Add "kraftControllerEpoch" to /controller ZNode instead just the "isKRaft" boolean * Rename ZkMigrationRecord to ZkMigrationStateRecord * In RegisterBrokerRecord, rename "IsZkBroker" to "IsMigratingZkBroker" * In BrokerRegistrationRequest, rename "ZkMigrationReady" to "IsMigratingZkBroker" (no longer a tagged field) * In the controller RPCs, use "isKraftController" boolean field instead of "KRaftControllerId" All of these changes were made prior to the release, so there's no compatibility issues. I just wanted to update the KIP to make it reflect reality. Cheers, David On Tue, Nov 29, 2022 at 10:14 PM Luke Chen <show...@gmail.com> wrote: > Thanks for the reply. > Yes, I think adding a section in Rejected Alternatives to explain the > rationale why we don't support combined mode upgrade in this KIP is > helpful. > > Thank you. > Luke > > On Wed, Nov 30, 2022 at 8:47 AM Jun Rao <j...@confluent.io.invalid> wrote: > > > Hi, David, > > > > Thanks for the reply. No other comments from me. > > > > Jun > > > > On Tue, Nov 29, 2022 at 2:57 PM David Arthur > > <david.art...@confluent.io.invalid> wrote: > > > > > Jun, > > > > > > 51. You're right, I missed that in the latest update. It's fixed now. > > > > > > 54. I was thinking we could update meta.properties to v1 as the brokers > > > were being restarted in KRaft mode without the migration config set. > > > However, as you mentioned, it is still possible to downgrade even then > > (as > > > long as the controller has not exited dual-write mode). Upgrading the > > > meta.properties after seeing the final ZkMigrationRecord sounds like a > > good > > > idea to me. I've updated the KIP to include this detail under > > > "Meta.Properties" section. > > > > > > 58. Yes, the metadata migration from ZK to KRaft will not migrate the > > > contents of /brokers. > > > > > > Thanks, > > > David > > > > > > On Tue, Nov 29, 2022 at 4:50 PM Jun Rao <j...@confluent.io.invalid> > > wrote: > > > > > > > Hi, David, > > > > > > > > Thanks for the reply. > > > > > > > > 51. Is this reflected in the KIP? It seems that ZkMigrationState > still > > > has > > > > the None value. > > > > > > > > 54. Supporting both v0 and v1 indefinitely in a KRaft broker could > be a > > > bit > > > > confusing and may complicate future upgrades. Another approach is to > > let > > > > KRaft broker write the v1 meta.properties after the KRaft controller > > > exits > > > > the dual write mode. We could extend ZkMigrationRecord to 3 states > like > > > > migration, dualWrite and KRaftOnly. Once a broker sees KRaftOnly, it > > will > > > > write meta.properties in v1 format. At that point, downgrade could > > cause > > > > metadata loss and require manual work. Will that work? > > > > > > > > 58. When copying metadata from ZK to KRaft, I guess we will ignore > > broker > > > > registration since the KRaft controller has already generated a > > > > BrokerRegistrationRecord based on BrokerRegistrationRequest? > > > > > > > > Thanks, > > > > > > > > Jun > > > > > > > > On Tue, Nov 29, 2022 at 7:14 AM David Arthur > > > > <david.art...@confluent.io.invalid> wrote: > > > > > > > > > Jun, Thanks for the comments. Igor, please see 54 below for some > > > > additional > > > > > discussion on the meta.properties > > > > > > > > > > 50.1 Yes, that field name sounds fine to me. > > > > > > > > > > 50.2 Ok, I'll add something to the KIP under the Controller > section. > > To > > > > > your other question, NoOpRecords are used as part of our liveness > > check > > > > for > > > > > the quorum. It doesn't produce any metadata really, so I don't > think > > it > > > > > causes any harm to let it happen before the migration. KIP-835 has > > the > > > > > details on the NoOpRecords > > > > > > > > > > 54. Colin and I discussed the meta.properties issue last night. How > > > about > > > > > we simply let the KRaft broker accept v0 or v1 meta.properties. At > > this > > > > > point, the two versions have the same contents, but different field > > > > names. > > > > > By leaving the meta.properties intact, we can simplify the > downgrade > > > > > process. If we care to, we could rewrite meta.properties once a > > broker > > > is > > > > > restarted after the migration is finalized (migration config > > disabled). > > > > > > > > > > 57. If a ZK broker can't send a BrokerRegistrationRequest because > the > > > > > quorum is unavailable, it should just continue operating normally. > > > Once a > > > > > leader is available, the broker will send the registration and > start > > > > > heart-beating. Unlike KRaft mode, we won't block startup on a > > > successful > > > > > BrokerRegistration response. Concretely, BrokerLifecycleManager > will > > > keep > > > > > trying to contact the quorum in its own thread until the > > > > > BrokerToChannelManager gets a controller ID from KafkaRaftManager. > > This > > > > > shouldn't interfere with other ZK broker activity. > > > > > > > > > > -David > > > > > > > > > > > > > > > > > > > > > -- > > > > > -David > > > > > > > > > > > > > > > > > > -- > > > -David > > > > > > -- -David