Hi, David,

Thanks for the reply.

Now that we have added KRaftControllerId to both UpdateMetdataRequest and
LeaderAndIsrRequest, should we add it to StopReplicaRequest to make it more
consistent?

Thanks,

Jun

On Wed, Nov 9, 2022 at 8:17 AM David Arthur <mum...@gmail.com> wrote:

> Jun, I've updated the doc with clarification of combined mode under
> the ApiVersionsResponse section. I also added your suggestion to
> rename the config to something more explicit. Colin had some feedback
> in the VOTE thread which I've also incorporated.
>
> Thanks!
> David
>
> On Tue, Nov 8, 2022 at 1:18 PM Jun Rao <j...@confluent.io.invalid> wrote:
> >
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > 20/21. Thanks for the explanation. The suggestion sounds good. Could you
> > include that in the doc?
> >
> > 40. metadata.migration.enable: We may do future metadata migrations
> within
> > KRaft. Could we make the name more specific to ZK migration like
> > zookeeper.metadata.migration.enable?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Nov 8, 2022 at 9:47 AM David Arthur <mum...@gmail.com> wrote:
> >
> > > Ah, sorry about that, you're right. Since we won't support ZK
> > > migrations in combined mode, is this issue avoided?
> > >
> > > Essentially, we would only set ZkMigrationReady in ApiVersionsResponse
> if
> > >
> > > * process.roles=controller
> > > * kafka.metadata.migration.enabled=true
> > >
> > > Also, the following would be an invalid config that should prevent
> startup:
> > >
> > > * process.roles=broker,controller
> > > * kafka.metadata.migration.enabled=true
> > >
> > > Does this seem reasonable?
> > >
> > > Thanks!
> > > David
> > >
> > > On Tue, Nov 8, 2022 at 11:12 AM Jun Rao <j...@confluent.io.invalid>
> wrote:
> > > >
> > > > Hi, David,
> > > >
> > > > I am not sure that we are fully settled on the following.
> > > >
> > > > 20/21. Since separate listeners are optional, it seems that the
> broker
> > > > can't distinguish between ApiVersion requests coming from the client
> or
> > > > other brokers. This means the clients will get ZkMigrationReady in
> the
> > > > ApiVersion response, which is weird.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Nov 8, 2022 at 7:18 AM David Arthur <mum...@gmail.com>
> wrote:
> > > >
> > > > > Thanks for the discussion everyone, I'm going to move ahead with
> the
> > > > > vote for this KIP.
> > > > >
> > > > > -David
> > > > >
> > > > > On Thu, Nov 3, 2022 at 1:20 PM Jun Rao <j...@confluent.io.invalid>
> > > wrote:
> > > > > >
> > > > > > Hi, David,
> > > > > >
> > > > > > Thanks for the reply.
> > > > > >
> > > > > > 20/21 Yes, but separate listeners are optional. It's possible
> for the
> > > > > nodes
> > > > > > to use a single port for both client and server side
> communications.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Nov 3, 2022 at 9:59 AM David Arthur
> > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > >
> > > > > > > 20/21, in combined mode we still have a separate listener for
> the
> > > > > > > controller APIs, e.g.,
> > > > > > >
> > > > > > > listeners=PLAINTEXT://:9092,CONTROLLER://:9093
> > > > > > >
> > > > > > > inter.broker.listener.name=PLAINTEXT
> > > > > > >
> > > > > > > controller.listener.names=CONTROLLER
> > > > > > >
> > > > > > > advertised.listeners=PLAINTEXT://localhost:9092
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Clients still talk to the broker through the advertised
> listener,
> > > and
> > > > > only
> > > > > > > brokers and other controllers will communicate over the
> controller
> > > > > > > listener.
> > > > > > >
> > > > > > > 40. Sounds good, I updated the KIP
> > > > > > >
> > > > > > > Thanks!
> > > > > > > David
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Nov 3, 2022 at 12:14 PM Jun Rao
> <j...@confluent.io.invalid>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi, David,
> > > > > > > >
> > > > > > > > Thanks for the reply.
> > > > > > > >
> > > > > > > > 20/21. When KRaft runs in the combined mode, does a
> controller
> > > know
> > > > > > > whether
> > > > > > > > an ApiRequest is from a client or another broker?
> > > > > > > >
> > > > > > > > 40. Adding a "None" state sounds reasonable.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Thu, Nov 3, 2022 at 8:39 AM David Arthur
> > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > 20/21 If we use a tagged field, then I don't think clients
> > > need to
> > > > > be
> > > > > > > > > concerned with it, right?. In ApiVersionsResponse sent by
> > > brokers
> > > > > to
> > > > > > > > > clients, this field would be omitted. Clients can't connect
> > > > > directly to
> > > > > > > > the
> > > > > > > > > KRaft controller nodes. Also, we have a precedent of
> sending
> > > > > controller
> > > > > > > > > node state between controllers through ApiVersions
> > > > > ("metadata.version"
> > > > > > > > > feature), so I think it fits well.
> > > > > > > > >
> > > > > > > > > 40. For new KRaft clusters, we could add another state to
> > > indicate
> > > > > it
> > > > > > > was
> > > > > > > > > not migrated from ZK, but created as a KRaft cluster. Maybe
> > > > > > > > >
> "kafka.controller:type=KafkaController,name=MigrationState" =>
> > > > > "None" ?
> > > > > > > > We
> > > > > > > > > could also omit that metric for unmigrated clusters, but
> I'm
> > > not a
> > > > > fan
> > > > > > > of
> > > > > > > > > using the absence of a value to signal something.
> > > > > > > > >
> > > > > > > > > -----
> > > > > > > > >
> > > > > > > > > Akhilesh, thanks for reviewing the KIP!
> > > > > > > > >
> > > > > > > > > 1. MigrationState and MetadataType are mostly the same on
> the
> > > > > > > controller,
> > > > > > > > > but we do have the "MigratingZkData" state that seems
> useful to
> > > > > report
> > > > > > > > as a
> > > > > > > > > metric. Aside from looking at logs, observing the
> controller in
> > > > > this
> > > > > > > > state
> > > > > > > > > is the only way to see how long its taking to copy data
> from
> > > ZK.
> > > > > > > > >
> > > > > > > > > "KRaftMode" instead of "MigrationFinalized" is similar to
> Jun's
> > > > > > > question
> > > > > > > > > about non-migrated clusters. I think it's useful to have a
> > > distinct
> > > > > > > > > MigrationState for clusters that have been migrated and
> those
> > > that
> > > > > were
> > > > > > > > > never migrated. This does mean we'll report the
> MigrationState
> > > long
> > > > > > > after
> > > > > > > > > the migration is complete, but we can drop these metrics
> in 4.0
> > > > > once ZK
> > > > > > > > is
> > > > > > > > > removed.
> > > > > > > > >
> > > > > > > > > 2. The "ZkMigrationReady" will indicate that the
> controller has
> > > > > > > > > "kafka.metadata.migration.enable" _and_ the ZK configs
> set. We
> > > need
> > > > > > > some
> > > > > > > > > way to indicate that the whole quorum is correctly
> configured
> > > to
> > > > > handle
> > > > > > > > the
> > > > > > > > > migration so we don't failover to a controller that's not
> > > > > configured
> > > > > > > for
> > > > > > > > > ZK. Did I understand your question correctly?
> > > > > > > > >
> > > > > > > > > 3. Yea, good idea. While the KRaft controller has
> > > > > > > > > MigrationState=MigrationIneligible, we could also report
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> "kafka.controller:type=KafkaController,name=MigrationInelgibleBrokerCount".
> > > > > > > > > It might be useful to report ineligible controllers as well
> > > since
> > > > > that
> > > > > > > > can
> > > > > > > > > prevent the migration from starting.
> > > > > > > > >
> > > > > > > > > 4. I think I covered this in "Incompatible Brokers". We
> > > effectively
> > > > > > > fence
> > > > > > > > > these brokers by not sending them metadata RPCs.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > > David
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Nov 1, 2022 at 3:58 PM Akhilesh Chaganti <
> > > > > > > akhilesh....@gmail.com
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi David,
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks for the KIP. I have some questions/suggestions.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 1) I see two new metrics:
> > > > > > > > > > kafka.controller:type=KafkaController,name=MetadataType
> and
> > > > > > > > > >
> kafka.controller:type=KafkaController,name=MigrationState.
> > > Won't
> > > > > the
> > > > > > > > > second
> > > > > > > > > > metric already cover the cases of the first metric? Also,
> > > > > instead of
> > > > > > > > > > MigrationFinalized, we could directly say the state is
> > > > > KRaftMode. So
> > > > > > > we
> > > > > > > > > can
> > > > > > > > > > use the same value for default KRaft clusters.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2) ZkMigrationReady in ApiVersionsResponse from KRaft
> > > > > Controller. By
> > > > > > > > > > default, we plan to start the Controller quorum in "
> > > > > > > > > > *kafka.metadata.migration.enable*" config set to true.
> Then
> > > do we
> > > > > > > need
> > > > > > > > > this
> > > > > > > > > > additional information again to make sure The
> controllers are
> > > > > ready
> > > > > > > for
> > > > > > > > > > migration? What would happen if the Controller assumes
> it is
> > > > > ready
> > > > > > > for
> > > > > > > > > > migration from 3.4 by default if it doesn't see both
> > > > > > > MigrationMetadata
> > > > > > > > > > records?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 3) I see that we do not impose order on rolling the
> brokers
> > > with
> > > > > > > > > migration
> > > > > > > > > > flags and provisioning the controller quorum. Along with
> the
> > > > > KRaft
> > > > > > > > > > controller emitting migration state metrics, it may be
> > > better to
> > > > > emit
> > > > > > > > the
> > > > > > > > > > broker count for the brokers not ready for migration yet.
> > > This
> > > > > will
> > > > > > > > give
> > > > > > > > > us
> > > > > > > > > > more insight into any roll issues.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 4) Once the KRaft controller is in migration mode, we
> should
> > > also
> > > > > > > > > > prevent/handle ZkBrokerRegistrations that don't enable
> > > migration
> > > > > > > mode.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > Akhilesh
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Nov 1, 2022 at 10:49 AM Jun Rao
> > > <j...@confluent.io.invalid
> > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, David,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the reply.
> > > > > > > > > > >
> > > > > > > > > > > 20/21. Regarding the new ZkMigrationReady field in
> > > > > > > > ApiVersionsResponse,
> > > > > > > > > > it
> > > > > > > > > > > seems that this is a bit intrusive since it exposes
> > > unneeded
> > > > > info
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > clients. Another option is to add that field as part
> of the
> > > > > Fetch
> > > > > > > > > > request.
> > > > > > > > > > > We can choose to only set that field in the very first
> > > Fetch
> > > > > > > request
> > > > > > > > > > from a
> > > > > > > > > > > Quorum follower.
> > > > > > > > > > >
> > > > > > > > > > > 40. For
> > > > > kafka.controller:type=KafkaController,name=MigrationState,
> > > > > > > > what
> > > > > > > > > > is
> > > > > > > > > > > the value for a brand new KRaft cluster?
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Oct 31, 2022 at 2:35 PM David Arthur
> > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > 30. I think we can keep the single ControllerId
> field in
> > > > > those
> > > > > > > > > requests
> > > > > > > > > > > > since they are only used for fencing (as far as I
> know).
> > > > > > > > Internally,
> > > > > > > > > > the
> > > > > > > > > > > > broker components that handle those requests will
> > > compare the
> > > > > > > > > > > ControllerId
> > > > > > > > > > > > with that of MetadataCache (which is updated via
> UMR).
> > > > > > > > > > > >
> > > > > > > > > > > > The reason we need the separate KRaftControllerId in
> the
> > > > > > > > > UpdateMetadata
> > > > > > > > > > > > code path so that we can have different connection
> > > behavior
> > > > > for a
> > > > > > > > > KRaft
> > > > > > > > > > > > controller vs ZK controller.
> > > > > > > > > > > >
> > > > > > > > > > > > 31. It seems reasonable to keep the MigrationRecord
> in
> > > the
> > > > > > > > snapshot.
> > > > > > > > > I
> > > > > > > > > > > was
> > > > > > > > > > > > thinking the same thing in terms of understanding the
> > > loss
> > > > > for a
> > > > > > > > > > > > migration-after-finalization. However, once a
> snapshot
> > > has
> > > > > been
> > > > > > > > taken
> > > > > > > > > > > that
> > > > > > > > > > > > includes the final MigrationRecord, we can't easily
> see
> > > which
> > > > > > > > records
> > > > > > > > > > > came
> > > > > > > > > > > > after it.
> > > > > > > > > > > >
> > > > > > > > > > > > 32. You're correct, we can just use the modify time
> from
> > > the
> > > > > > > Stat.
> > > > > > > > > The
> > > > > > > > > > > > other two fields are primarily informational and are
> > > there
> > > > > for
> > > > > > > > > > operators
> > > > > > > > > > > > who want to inspect the state of the migration. They
> > > aren't
> > > > > > > > required
> > > > > > > > > > for
> > > > > > > > > > > > correctness
> > > > > > > > > > > >
> > > > > > > > > > > > 33. Yes that's right. I detail that in "Controller
> > > > > Leadership"
> > > > > > > > > section
> > > > > > > > > > > >
> > > > > > > > > > > > 34. Right, I'll fix that.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > David
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Oct 31, 2022 at 2:55 PM Jun Rao
> > > > > <j...@confluent.io.invalid
> > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 30. LeaderAndIsrRequest/StopReplicaRequest both
> have a
> > > > > > > > controllerId
> > > > > > > > > > > > field.
> > > > > > > > > > > > > Should we add a KRaftControllerId field like
> > > > > UpdateMetadata?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 31. "If a migration has been finalized, but the
> KRaft
> > > > > quroum
> > > > > > > > comes
> > > > > > > > > up
> > > > > > > > > > > > with
> > > > > > > > > > > > > kafka.metadata.migration.enable, we must not
> re-enter
> > > the
> > > > > > > > migration
> > > > > > > > > > > mode.
> > > > > > > > > > > > > In this case, while replaying the log, the
> controller
> > > can
> > > > > see
> > > > > > > the
> > > > > > > > > > > second
> > > > > > > > > > > > > MigrationRecord and know that the migration is
> > > finalized
> > > > > and
> > > > > > > > should
> > > > > > > > > > not
> > > > > > > > > > > > be
> > > > > > > > > > > > > resumed. " Hmm, do we want to keep the
> MigrationRecord
> > > in
> > > > > the
> > > > > > > > > > snapshot
> > > > > > > > > > > > and
> > > > > > > > > > > > > the metadata log forever after migration is
> finalized?
> > > If
> > > > > not,
> > > > > > > we
> > > > > > > > > > can't
> > > > > > > > > > > > > know for sure whether a migration has happened or
> not.
> > > > > Also, it
> > > > > > > > > might
> > > > > > > > > > > be
> > > > > > > > > > > > > useful to support switching back to ZK mode after
> the
> > > > > migration
> > > > > > > > is
> > > > > > > > > > > > > finalized, with the understanding of potential
> metadata
> > > > > loss.
> > > > > > > In
> > > > > > > > > that
> > > > > > > > > > > > case,
> > > > > > > > > > > > > we could just trim all metadata log and recopy the
> ZK
> > > > > metadata
> > > > > > > > > back.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 32. The /migration node in ZK: Do we need
> > > > > last_update_time_ms
> > > > > > > > since
> > > > > > > > > > ZK
> > > > > > > > > > > > > Stats already has an MTime? Also, how do we plan to
> > > use the
> > > > > > > > > > > > > kraft_controller_id and kraft_controller_epoch
> fields?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 33. Controller migration: We will force a write to
> the
> > > > > > > > > "/controller"
> > > > > > > > > > > and
> > > > > > > > > > > > > "/controller_epoch" ZNodes before copying ZK data,
> > > right?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 34. "Operator can remove the persistent
> "/controller"
> > > and
> > > > > > > > > > > > > "/controller_epoch" nodes allowing for ZK
> controller
> > > > > election
> > > > > > > to
> > > > > > > > > take
> > > > > > > > > > > > > place". I guess the operator only needs to remove
> the
> > > > > > > /controller
> > > > > > > > > > path?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Oct 31, 2022 at 7:17 AM David Arthur
> > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Happy Monday, everyone! I've updated the KIP
> with the
> > > > > > > following
> > > > > > > > > > > > changes:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > * Clarified MetadataType metric usages (broker vs
> > > > > controller)
> > > > > > > > > > > > > > * Added ZkMigrationReady tagged field to
> > > > > ApiVersionsResponse
> > > > > > > > (for
> > > > > > > > > > use
> > > > > > > > > > > > by
> > > > > > > > > > > > > > KRaft controller quorum)
> > > > > > > > > > > > > > * Added MigrationRecord with two states: Started
> and
> > > > > Finished
> > > > > > > > > > > > > > * Documented ZK configs for KRaft controller
> > > > > > > > > > > > > > * Simplified state machine description
> (internally,
> > > more
> > > > > > > states
> > > > > > > > > > will
> > > > > > > > > > > > > exist,
> > > > > > > > > > > > > > but only the four documented are interesting to
> > > > > operators)
> > > > > > > > > > > > > > * Clarified some things in Controller Migration
> > > section
> > > > > > > > > > > > > > * Removed KRaft -> ZK parts of Broker
> Registration
> > > > > > > > > > > > > > * Added Misconfigurations section to Failure
> Modes
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let me know if I've missed anything from the
> past two
> > > > > weeks
> > > > > > > of
> > > > > > > > > > > > > discussion.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks again to everyone who has reviewed this
> KIP so
> > > > > far!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -David
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Oct 28, 2022 at 2:26 PM Jun Rao
> > > > > > > > <j...@confluent.io.invalid
> > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the reply.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 20/21. Sounds good.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Could you update the doc with all the changes
> being
> > > > > > > > discussed?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Oct 28, 2022 at 10:11 AM David Arthur
> > > > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 20/21. I was also wondering about a
> "migration"
> > > > > record.
> > > > > > > In
> > > > > > > > > > > addition
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > scenario you mentioned, we also need a way to
> > > > > prevent the
> > > > > > > > > > cluster
> > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > re-entering the dual write mode after the
> > > migration
> > > > > has
> > > > > > > > been
> > > > > > > > > > > > > > finalized. I
> > > > > > > > > > > > > > > > could see this happening inadvertently via a
> > > change
> > > > > in
> > > > > > > some
> > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > management system. How about we add a record
> that
> > > > > marks
> > > > > > > the
> > > > > > > > > > > > beginning
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > end of the dual-write mode. The first
> occurrence
> > > of
> > > > > the
> > > > > > > > > record
> > > > > > > > > > > > could
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > included in the metadata transaction when we
> > > migrate
> > > > > data
> > > > > > > > > from
> > > > > > > > > > > ZK.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > With this, the active controller would decide
> > > > > whether to
> > > > > > > > > enter
> > > > > > > > > > > dual
> > > > > > > > > > > > > > write
> > > > > > > > > > > > > > > > mode, finalize the migration based, or fail
> > > based on:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > * Metadata log state
> > > > > > > > > > > > > > > > * It's own configuration
> > > > > > > > ("kafka.metadata.migration.enable",
> > > > > > > > > > > > > > > > "zookeeper.connect", etc)
> > > > > > > > > > > > > > > > * The other controllers configuration (via
> > > > > > > > > ApiVersionsResponse)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > WDYT?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 22. Since we will need the fencing anyways
> as a
> > > > > > > safe-guard,
> > > > > > > > > > then
> > > > > > > > > > > I
> > > > > > > > > > > > > > agree
> > > > > > > > > > > > > > > > would could skip the registration of KRaft
> > > brokers
> > > > > in ZK
> > > > > > > to
> > > > > > > > > > > simply
> > > > > > > > > > > > > > > things a
> > > > > > > > > > > > > > > > bit.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 5:11 PM Jun Rao
> > > > > > > > > > <j...@confluent.io.invalid
> > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the reply.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 20/21. Relying upon the presence of ZK
> configs
> > > to
> > > > > > > > determine
> > > > > > > > > > > > whether
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > KRaft controller is in a dual write mode
> seems
> > > a
> > > > > bit
> > > > > > > > error
> > > > > > > > > > > prone.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > > someone accidentally adds a ZK
> configuration
> > > to a
> > > > > brand
> > > > > > > > new
> > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > ideally it shouldn't cause the controller
> to
> > > get
> > > > > into a
> > > > > > > > > weird
> > > > > > > > > > > > > state.
> > > > > > > > > > > > > > > Have
> > > > > > > > > > > > > > > > > we considered storing the migration state
> in a
> > > > > metadata
> > > > > > > > > > record?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 22. If we have the broker fencing logic,
> do we
> > > > > need to
> > > > > > > > > write
> > > > > > > > > > > the
> > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > registration path in ZK for KRaft brokers
> at
> > > all?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 1:02 PM David
> Arthur
> > > > > > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 20/21. A KRaft controller will recover
> the
> > > > > migration
> > > > > > > > > state
> > > > > > > > > > by
> > > > > > > > > > > > > > reading
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > "/migration" ZNode. If the migration
> enable
> > > > > config is
> > > > > > > > > set,
> > > > > > > > > > > and
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > migration is complete, it will enter the
> > > > > dual-write
> > > > > > > > mode.
> > > > > > > > > > > > Before
> > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > operator can decommission ZK, they will
> need
> > > to
> > > > > > > > finalize
> > > > > > > > > > the
> > > > > > > > > > > > > > > migration
> > > > > > > > > > > > > > > > > > which involves removing the migration
> config
> > > and
> > > > > the
> > > > > > > ZK
> > > > > > > > > > > config.
> > > > > > > > > > > > > > I'll
> > > > > > > > > > > > > > > > > > clarify this in the KIP.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 22. Yea, we could see an incorrect
> broker ID
> > > > > during
> > > > > > > > that
> > > > > > > > > > > > window.
> > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > ended up with a state where we saw a ZK
> > > broker ID
> > > > > > > that
> > > > > > > > > > > > conflicted
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > KRaft broker ID, we would need to fence
> one
> > > of
> > > > > them.
> > > > > > > I
> > > > > > > > > > would
> > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > opt
> > > > > > > > > > > > > > > > > > to fence the KRaft broker in that case
> since
> > > > > broker
> > > > > > > > > > > > registration
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > fencing is more robust in KRaft.
> Hopefully
> > > this
> > > > > is a
> > > > > > > > rare
> > > > > > > > > > > case.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 26. Sounds good.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 1:34 PM Jun Rao
> > > > > > > > > > > > <j...@confluent.io.invalid
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks for the reply. A few more
> comments.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 20/21. Using a tagged field in
> > > > > ApiVersionRequest
> > > > > > > > could
> > > > > > > > > > > work.
> > > > > > > > > > > > > > > Related
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > this, how does a KRaft controller know
> that
> > > > > it's in
> > > > > > > > the
> > > > > > > > > > > dual
> > > > > > > > > > > > > > write
> > > > > > > > > > > > > > > > > mode?
> > > > > > > > > > > > > > > > > > > Does it need to read the /controller
> path
> > > from
> > > > > ZK?
> > > > > > > > > After
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > migration,
> > > > > > > > > > > > > > > > > > > people may have the ZK cluster
> > > decommissioned,
> > > > > but
> > > > > > > > > still
> > > > > > > > > > > have
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > configs left in the KRaft controller.
> Will
> > > this
> > > > > > > cause
> > > > > > > > > the
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > to be stuck because it doesn't know
> which
> > > mode
> > > > > it
> > > > > > > is
> > > > > > > > > in?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 22. Using the ephemeral node matches
> the
> > > > > current
> > > > > > > > > ZK-based
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > behavior
> > > > > > > > > > > > > > > > > > > better. However, it leaves a window for
> > > > > incorrect
> > > > > > > > > broker
> > > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > sneak in during KRaft controller
> failover.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 26. Then, we could just remove Broker
> > > > > Registration
> > > > > > > in
> > > > > > > > > > that
> > > > > > > > > > > > > > section.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Oct 26, 2022 at 2:21 PM David
> > > Arthur
> > > > > > > > > > > > > > > > > > > <david.art...@confluent.io.invalid>
> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 20/21 It could definitely cause
> problems
> > > if
> > > > > we
> > > > > > > > > failover
> > > > > > > > > > > to
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > without
> > > "kafka.metadata.migration.enable".
> > > > > The
> > > > > > > only
> > > > > > > > > > > > > mechanism I
> > > > > > > > > > > > > > > > know
> > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > controllers to learn things about one
> > > > > another is
> > > > > > > > > > > > ApiVersions.
> > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > currently
> > > > > > > > > > > > > > > > > > > > use this for checking support for
> > > > > > > > "metadata.version"
> > > > > > > > > > (in
> > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > mode).
> > > > > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > > could add a "zk.migration" feature
> flag
> > > > > that's
> > > > > > > > > enabled
> > > > > > > > > > > on a
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > > > > if the config is set. Another
> possibility
> > > > > would
> > > > > > > be
> > > > > > > > a
> > > > > > > > > > > tagged
> > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > ApiVersionResponse that indicated if
> the
> > > > > config
> > > > > > > was
> > > > > > > > > > set.
> > > > > > > > > > > > Both
> > > > > > > > > > > > > > > seem
> > > > > > > > > > > > > > > > > > > somewhat
> > > > > > > > > > > > > > > > > > > > inelegant. I think a tagged field
> would
> > > be a
> > > > > bit
> > > > > > > > > > simpler
> > > > > > > > > > > > (and
> > > > > > > > > > > > > > > > > arguably
> > > > > > > > > > > > > > > > > > > less
> > > > > > > > > > > > > > > > > > > > hacky).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > For 20, we could avoid entering the
> > > migration
> > > > > > > state
> > > > > > > > > > > unless
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > whole
> > > > > > > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > > had the field present in their
> > > > > NodeApiVersions.
> > > > > > > For
> > > > > > > > > 21,
> > > > > > > > > > > we
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > > > > leaving the migration state unless
> the
> > > whole
> > > > > > > quorum
> > > > > > > > > did
> > > > > > > > > > > not
> > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > > > > > in their NodeApiVersions. Do you
> think
> > > this
> > > > > would
> > > > > > > > be
> > > > > > > > > > > > > > sufficient?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 22. Right, we need to write the
> broker
> > > info
> > > > > back
> > > > > > > to
> > > > > > > > > ZK
> > > > > > > > > > > just
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > safeguard
> > > > > > > > > > > > > > > > > > > > against incorrect broker IDs getting
> > > > > registered
> > > > > > > > into
> > > > > > > > > > ZK.
> > > > > > > > > > > I
> > > > > > > > > > > > > was
> > > > > > > > > > > > > > > > > thinking
> > > > > > > > > > > > > > > > > > > > these would be persistent nodes, but
> it's
> > > > > > > probably
> > > > > > > > > fine
> > > > > > > > > > > to
> > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > > > > > ephemeral and have the active KRaft
> > > > > controller
> > > > > > > keep
> > > > > > > > > > them
> > > > > > > > > > > up
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > date.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 23. Right. When the broker comes up
> as a
> > > > > KRaft
> > > > > > > > > broker,
> > > > > > > > > > > it's
> > > > > > > > > > > > > old
> > > > > > > > > > > > > > > > > > > > /brokers/ids ZNode will be gone and
> it
> > > will
> > > > > > > > register
> > > > > > > > > > > itself
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > controller. The controller will know
> > > that it
> > > > > is
> > > > > > > now
> > > > > > > > > in
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > mode
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > > stop sending it the old RPCs.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 24. Ok, I'll add these
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 25. I realize now the
> "/controller_epoch"
> > > > > node is
> > > > > > > > > > already
> > > > > > > > > > > > > > > > persistent.
> > > > > > > > > > > > > > > > > > It
> > > > > > > > > > > > > > > > > > > > should be sufficient to remove the
> > > > > "/controller"
> > > > > > > > node
> > > > > > > > > > to
> > > > > > > > > > > > > > trigger
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > election.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 26. Hmm, not sure, but I don't think
> it
> > > uses
> > > > > > > > watches.
> > > > > > > > > > > > > > KafkaServer
> > > > > > > > > > > > > > > > > > > registers
> > > > > > > > > > > > > > > > > > > > the broker info and later loads all
> > > brokers
> > > > > in
> > > > > > > the
> > > > > > > > > > > cluster
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > the endpoints don't conflict. Things
> > > that do
> > > > > use
> > > > > > > > > > watches
> > > > > > > > > > > > are
> > > > > > > > > > > > > > > > dynamic
> > > > > > > > > > > > > > > > > > > > configs, ACLs, and some others
> (listed
> > > in the
> > > > > > > KIP).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, Oct 26, 2022 at 4:26 PM David
> > > Arthur
> > > > > <
> > > > > > > > > > > > > > > > > > david.art...@confluent.io>
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Luke and Andrew, thanks for taking
> a
> > > look!
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I think the names of the state
> machine
> > > in
> > > > > the
> > > > > > > KIP
> > > > > > > > > > > aren't
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > best.
> > > > > > > > > > > > > > > > > > I'll
> > > > > > > > > > > > > > > > > > > > > try to improve that section in
> general.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 1. "MigrationReady" is probably
> better
> > > > > named
> > > > > > > > > > > > > > "MigratingFromZk"
> > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > > > > something. It's meant to be the
> state
> > > when
> > > > > the
> > > > > > > > > KRaft
> > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > actively
> > > > > > > > > > > > > > > > > > > > > migrating the data out of ZK. This
> > > happens
> > > > > > > after
> > > > > > > > we
> > > > > > > > > > > have
> > > > > > > > > > > > > > > detected
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > cluster is eligible and before we
> > > enter the
> > > > > > > dual
> > > > > > > > > > write
> > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2. "MigrationActive" should
> probably be
> > > > > called
> > > > > > > > > > > > > > > "BrokerMigration".
> > > > > > > > > > > > > > > > > > > > > Transitioning to
> "MigrationFinished"
> > > can
> > > > > happen
> > > > > > > > > > > > > automatically
> > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > known ZK brokers have been
> migrated.
> > > Since
> > > > > we
> > > > > > > > don't
> > > > > > > > > > > have
> > > > > > > > > > > > > > > > permanent
> > > > > > > > > > > > > > > > > > > > > registrations of ZK brokers, we can
> > > use the
> > > > > > > > > partition
> > > > > > > > > > > > > > > assignments
> > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > proxy for this.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 3. A metric for unready brokers
> makes
> > > > > sense. We
> > > > > > > > can
> > > > > > > > > > > also
> > > > > > > > > > > > > log
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > message
> > > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > the controller when it tries to
> start
> > > the
> > > > > > > > migration
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 4. The MetadataType metric is
> meant to
> > > > > help see
> > > > > > > > > this.
> > > > > > > > > > > > E.g.,
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > > > would have MetadataType=ZK, some
> would
> > > have
> > > > > > > > > > > > > MetadataType=Dual
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 5. On ZK brokers, not having this
> > > config
> > > > > set
> > > > > > > > would
> > > > > > > > > > > > prevent
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > > > registration data from being
> written
> > > to ZK.
> > > > > > > This
> > > > > > > > > > means
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > > won't send RPCs to it. If a broker
> has
> > > been
> > > > > > > > > migrated
> > > > > > > > > > to
> > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > already,
> > > > > > > > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > > > > not sure there is any harm in
> removing
> > > this
> > > > > > > > config.
> > > > > > > > > > If
> > > > > > > > > > > we
> > > > > > > > > > > > > > > decide
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > need to guarantee that KRaft
> brokers
> > > have
> > > > > this
> > > > > > > > > config
> > > > > > > > > > > set
> > > > > > > > > > > > > > > during
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > migration, we can include it in the
> > > broker
> > > > > > > > > > registration
> > > > > > > > > > > > > > that's
> > > > > > > > > > > > > > > > sent
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > KRaft. This would let the
> controller
> > > keep
> > > > > that
> > > > > > > > > broker
> > > > > > > > > > > > > fenced.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 6. Once the KRaft controller takes
> > > > > leadership,
> > > > > > > > the
> > > > > > > > > ZK
> > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > won't
> > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > active any more and will stop
> > > reporting the
> > > > > > > > > > > > MigrationState
> > > > > > > > > > > > > > > > metric.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 7. The "Dual" MetadataType is
> reported
> > > by
> > > > > > > brokers
> > > > > > > > > > > running
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > mode
> > > > > > > > > > > > > > > > > > > > > when the migration enable config is
> > > set. I
> > > > > > > think
> > > > > > > > > the
> > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > > > > > > > report this, not just the brokers.
> I'll
> > > > > clarify
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > KIP.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Andrew:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > How will the code get all the
> > > ZooKeeper
> > > > > > > config?
> > > > > > > > > > Will
> > > > > > > > > > > it
> > > > > > > > > > > > > do
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > sort
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > scan of the ZooKeeper data store?
> ...
> > > Do we
> > > > > > > need
> > > > > > > > to
> > > > > > > > > > do
> > > > > > > > > > > > > > anything
> > > > > > > > > > > > > > > > > > special
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > make sure we get all data from
> > > ZooKeeper?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > With a few exceptions, all data
> > > written to
> > > > > ZK
> > > > > > > by
> > > > > > > > > > Kafka
> > > > > > > > > > > > > > happens
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > controller (single writer
> principle).
> > > In
> > > > > our
> > > > > > > > > > migration
> > > > > > > > > > > > > code,
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > > unconditionally update the
> > > "/controller"
> > > > > and
> > > > > > > > > > > > > > > "/controller_epoch"
> > > > > > > > > > > > > > > > > > ZNodes
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > effectively force the migration
> > > component
> > > > > to
> > > > > > > gain
> > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > > controller. Once this happens, we
> don't
> > > > > expect
> > > > > > > > any
> > > > > > > > > > data
> > > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > written
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > ZK, so we can read it iteratively
> > > without
> > > > > > > > worrying
> > > > > > > > > > > about
> > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > concurrent
> > > > > > > > > > > > > > > > > > > > > writes.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > As for the linearizable thing, I
> > > believe
> > > > > this
> > > > > > > > just
> > > > > > > > > > > means
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > reads
> > > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > served by a quorum follower which
> has
> > > stale
> > > > > > > data.
> > > > > > > > > > We'll
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > reading
> > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > > the same way the ZK controller
> does,
> > > so I
> > > > > think
> > > > > > > > we
> > > > > > > > > > will
> > > > > > > > > > > > be
> > > > > > > > > > > > > > fine
> > > > > > > > > > > > > > > > > > > regarding
> > > > > > > > > > > > > > > > > > > > > consistency. If we wanted to be
> extra
> > > > > careful,
> > > > > > > we
> > > > > > > > > > could
> > > > > > > > > > > > > add a
> > > > > > > > > > > > > > > > delay
> > > > > > > > > > > > > > > > > > > prior
> > > > > > > > > > > > > > > > > > > > > to iterating through the znodes to
> give
> > > > > > > > partitioned
> > > > > > > > > > ZK
> > > > > > > > > > > > > > > followers
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > chance
> > > > > > > > > > > > > > > > > > > > > to get kicked out of the quorum. I
> > > don't
> > > > > think
> > > > > > > > > we'll
> > > > > > > > > > > need
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > though
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > What happens if we commit the
> > > transaction
> > > > > > > then
> > > > > > > > > fail
> > > > > > > > > > > > right
> > > > > > > > > > > > > > > after
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > restart?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > If we commit the "migration"
> > > transaction
> > > > > to the
> > > > > > > > > > > metadata
> > > > > > > > > > > > > log,
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > won't
> > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > a problem if we failover or
> restart.
> > > We can
> > > > > > > > recover
> > > > > > > > > > our
> > > > > > > > > > > > > state
> > > > > > > > > > > > > > > > based
> > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > metadata log and what exists in
> ZK. If
> > > we
> > > > > see a
> > > > > > > > > > > migration
> > > > > > > > > > > > > > > > > transaction
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > the kraft log, but no "/migration"
> > > ZNode,
> > > > > we'll
> > > > > > > > > know
> > > > > > > > > > we
> > > > > > > > > > > > > > failed
> > > > > > > > > > > > > > > > > before
> > > > > > > > > > > > > > > > > > > > > writing to ZK. If we see a partial
> > > > > transaction
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > log,
> > > > > > > > > > > > > > then
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > > abort it and restart the migration.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > This sort of leads to me
> wondering
> > > if we
> > > > > > > > > > will/should
> > > > > > > > > > > > > check
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > > > > > > log state before doing the
> migration?
> > > That
> > > > > > > could
> > > > > > > > > also
> > > > > > > > > > > > catch
> > > > > > > > > > > > > > > > > mistakes
> > > > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > > as if a KRaft quorum is used that
> > > already
> > > > > had
> > > > > > > > some
> > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > which I
> > > > > > > > > > > > > > > > > > > > assume
> > > > > > > > > > > > > > > > > > > > > we want to prevent.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Yes, we should check that the
> metadata
> > > log
> > > > > is
> > > > > > > > empty
> > > > > > > > > > > > before
> > > > > > > > > > > > > > > > > > attempting a
> > > > > > > > > > > > > > > > > > > > > migration (perhaps excepting the
> > > bootstrap
> > > > > > > > metadata
> > > > > > > > > > --
> > > > > > > > > > > > need
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > that one still).
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Mon, Oct 24, 2022 at 8:57 AM
> Andrew
> > > > > Grant
> > > > > > > > > > > > > > > > > > > <agr...@confluent.io.invalid
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> Hey David,
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Thanks for the KIP. I had a few
> small
> > > > > > > questions.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> “The ZK data migration will copy
> the
> > > > > existing
> > > > > > > ZK
> > > > > > > > > > data
> > > > > > > > > > > > into
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> metadata log and establish the new
> > > KRaft
> > > > > > > active
> > > > > > > > > > > > controller
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > > >> controller from a ZK perspective.”
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> How will the code get all the
> > > ZooKeeper
> > > > > > > config?
> > > > > > > > > Will
> > > > > > > > > > > it
> > > > > > > > > > > > do
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > sort
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> scan of the ZooKeeper data store?
> > > Also I’m
> > > > > > > not a
> > > > > > > > > > > > ZooKeeper
> > > > > > > > > > > > > > > > expert
> > > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >
> > > > > > > > > >
> > > https://zookeeper.apache.org/doc/current/zookeeperInternals.html
> > > > > > > > > > > > > > > > > > > “Read
> > > > > > > > > > > > > > > > > > > > >> operations in ZooKeeper are *not
> > > > > linearizable*
> > > > > > > > > since
> > > > > > > > > > > > they
> > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > return
> > > > > > > > > > > > > > > > > > > > >> potentially stale data.” Do we
> need
> > > to do
> > > > > > > > anything
> > > > > > > > > > > > special
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > > > sure
> > > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > >> get all data from ZooKeeper?
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> “For the initial migration, the
> > > controller
> > > > > > > will
> > > > > > > > > > > utilize
> > > > > > > > > > > > > > > KIP-868
> > > > > > > > > > > > > > > > > > > Metadata
> > > > > > > > > > > > > > > > > > > > >> Transactions
> > > > > > > > > > > > > > > > > > > > >> <
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-868+Metadata+Transactions
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> write all of the ZK metadata in a
> > > single
> > > > > > > > > > transaction.
> > > > > > > > > > > If
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > >> fails before this transaction is
> > > > > finalized,
> > > > > > > the
> > > > > > > > > next
> > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > >> will
> > > > > > > > > > > > > > > > > > > > >> abort the transaction and restart
> the
> > > > > > > migration
> > > > > > > > > > > > process.”
> > > > > > > > > > > > > > What
> > > > > > > > > > > > > > > > > > happens
> > > > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > > >> we commit the transaction then
> fail
> > > right
> > > > > > > after
> > > > > > > > > that
> > > > > > > > > > > and
> > > > > > > > > > > > > > > > restart?
> > > > > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > > > >> sort
> > > > > > > > > > > > > > > > > > > > >> of leads to me wondering if we
> > > will/should
> > > > > > > check
> > > > > > > > > the
> > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > log
> > > > > > > > > > > > > > > > > > > state
> > > > > > > > > > > > > > > > > > > > >> before doing the migration? That
> could
> > > > > also
> > > > > > > > catch
> > > > > > > > > > > > mistakes
> > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > >> KRaft quorum is used that already
> had
> > > some
> > > > > > > > > metadata
> > > > > > > > > > > > which
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > assume
> > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > > > > > >> to prevent.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> For the Test Plan section do you
> think
> > > > > it’s
> > > > > > > > worth
> > > > > > > > > > > > calling
> > > > > > > > > > > > > > out
> > > > > > > > > > > > > > > > > > > > performance
> > > > > > > > > > > > > > > > > > > > >> testing migrations of ZooKeeper
> > > > > deployments
> > > > > > > that
> > > > > > > > > > have
> > > > > > > > > > > > > > “large”
> > > > > > > > > > > > > > > > > > amounts
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> metadata?
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> Andrew
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> On Mon, Oct 24, 2022 at 3:20 AM
> Luke
> > > Chen
> > > > > <
> > > > > > > > > > > > > > show...@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> > Hi David,
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > Thanks for the good and
> complicated
> > > > > > > proposal!
> > > > > > > > :)
> > > > > > > > > > > > > > > > > > > > >> > Some questions:
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 1. "MigrationReady" state: The
> KIP
> > > said:
> > > > > > > > > > > > > > > > > > > > >> > The KRaft quorum has been
> started
> > > > > > > > > > > > > > > > > > > > >> > I don't think we'll
> automatically
> > > enter
> > > > > this
> > > > > > > > > state
> > > > > > > > > > > > after
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > > >> > started, do we?
> > > > > > > > > > > > > > > > > > > > >> > I think KRaft active controller
> > > should
> > > > > take
> > > > > > > > over
> > > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > > > > writing a
> > > > > > > > > > > > > > > > > > > > >> > znode in /controller path
> before we
> > > > > entering
> > > > > > > > > this
> > > > > > > > > > > > sate,
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > correct?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 2. "MigrationActive" state: the
> KIP
> > > > > said:
> > > > > > > > > > > > > > > > > > > > >> > ZK state has been migrated,
> > > controller
> > > > > is in
> > > > > > > > > > > > dual-write
> > > > > > > > > > > > > > > mode,
> > > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > > >> are
> > > > > > > > > > > > > > > > > > > > >> > being restarted in KRaft mode
> > > > > > > > > > > > > > > > > > > > >> > What confuses me is the last
> part:
> > > > > "brokers
> > > > > > > > are
> > > > > > > > > > > being
> > > > > > > > > > > > > > > > restarted
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > mode".
> > > > > > > > > > > > > > > > > > > > >> > How could we detect brokers are
> > > being
> > > > > > > > restarted
> > > > > > > > > in
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > mode?
> > > > > > > > > > > > > > > > > Old
> > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > broker is removed and new KRaft
> > > broker
> > > > > is up
> > > > > > > > > > within
> > > > > > > > > > > N
> > > > > > > > > > > > > > > minutes?
> > > > > > > > > > > > > > > > > > > > >> > I think we don't have to rely
> on the
> > > > > > > condition
> > > > > > > > > > > > "brokers
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > > > > >> restarted
> > > > > > > > > > > > > > > > > > > > >> > in KRaft mode" to enter this
> state.
> > > > > > > > > > > > > > > > > > > > >> > "brokers are being restarted in
> > > KRaft
> > > > > mode"
> > > > > > > > > should
> > > > > > > > > > > be
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > > happened
> > > > > > > > > > > > > > > > > > > > >> > between "MigrationActive" and
> > > > > > > > > "MigrationFinished".
> > > > > > > > > > > > Does
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > > > > sense?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 3. When "Zookeeper" mode trying
> to
> > > enter
> > > > > > > > > > > > > > > "MigrationEligible",
> > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > > >> is a
> > > > > > > > > > > > > > > > > > > > >> > cluster with tens of brokers,
> how
> > > could
> > > > > > > users
> > > > > > > > > know
> > > > > > > > > > > why
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > > > > > > >> cannot
> > > > > > > > > > > > > > > > > > > > >> > be in "MigrationEligible" state,
> > > yet?
> > > > > Check
> > > > > > > > that
> > > > > > > > > > > znode
> > > > > > > > > > > > > > > > manually
> > > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > > > > >> one?
> > > > > > > > > > > > > > > > > > > > >> > Or do we plan to create a tool
> to
> > > help
> > > > > them?
> > > > > > > > Or
> > > > > > > > > > > maybe
> > > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> "unready
> > > > > > > > > > > > > > > > > > > > >> > ZK brokers" metrics?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 4. Same for "MigrationActive"
> > > entering
> > > > > > > > > > > > > "MigrationFinished"
> > > > > > > > > > > > > > > > > state.
> > > > > > > > > > > > > > > > > > > > Since
> > > > > > > > > > > > > > > > > > > > >> > that will be some process for
> > > > > restarting ZK
> > > > > > > > > broker
> > > > > > > > > > > > into
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > > >> one by
> > > > > > > > > > > > > > > > > > > > >> > one, could we expose the
> "remaining
> > > ZK
> > > > > > > > brokers"
> > > > > > > > > as
> > > > > > > > > > > > > > metrics?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 5. When users are in
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > "MigrationReady"/"MigrationActive"/"MigrationFinished"
> > > > > > > > > > > > > > > > > > > > >> > states, and they accidentally
> > > change the
> > > > > > > KRaft
> > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > config:
> > > > > > > > > > > > > > > > > > > > >> >
> "kafka.metadata.migration.enable"
> > > > > > > > > > > > > > > > > > > > >> > to false. What will happen to
> this
> > > > > cluster?
> > > > > > > No
> > > > > > > > > > > > > dual-write
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > it?
> > > > > > > > > > > > > > > > > > > Will
> > > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > > >> > have any protection for it?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 6. About the "MigrationState"
> > > metric:
> > > > > > > > > > > > > > > > > > > > >> > The "ZooKeeper" and
> > > "MigrationEligible"
> > > > > is
> > > > > > > > > > reported
> > > > > > > > > > > by
> > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > > > > > >> and
> > > > > > > > > > > > > > > > > > > > >> > the rest of states are reported
> by
> > > KRaft
> > > > > > > > > > controller
> > > > > > > > > > > > -->
> > > > > > > > > > > > > > > makes
> > > > > > > > > > > > > > > > > > sense
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> me.
> > > > > > > > > > > > > > > > > > > > >> > One question from it is, when
> KRaft
> > > > > > > controller
> > > > > > > > > > takes
> > > > > > > > > > > > > over
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > > > > > > > >> > from ZK controller, what will
> the
> > > > > > > > > "MigrationState"
> > > > > > > > > > > > value
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > old
> > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > controller? keep in
> > > "MigrationEligible"
> > > > > > > > doesn't
> > > > > > > > > > make
> > > > > > > > > > > > > > sense.
> > > > > > > > > > > > > > > > Will
> > > > > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > > >> be a
> > > > > > > > > > > > > > > > > > > > >> > empty or null state?
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > 7. About the "MetadataType"
> metric:
> > > > > > > > > > > > > > > > > > > > >> > An enumeration of: ZooKeeper
> (1),
> > > Dual
> > > > > (2),
> > > > > > > > > KRaft
> > > > > > > > > > > (3).
> > > > > > > > > > > > > > Each
> > > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > > >> reports
> > > > > > > > > > > > > > > > > > > > >> > this.
> > > > > > > > > > > > > > > > > > > > >> > I don't know how we could map
> the
> > > > > migration
> > > > > > > > > state
> > > > > > > > > > to
> > > > > > > > > > > > > > these 3
> > > > > > > > > > > > > > > > > > types.
> > > > > > > > > > > > > > > > > > > > >> > What is the metadataType when
> > > cluster in
> > > > > > > > > > > > > "MigrationReady"
> > > > > > > > > > > > > > > > state?
> > > > > > > > > > > > > > > > > > > Still
> > > > > > > > > > > > > > > > > > > > >> > Zookeeper?
> > > > > > > > > > > > > > > > > > > > >> > When will brokers enter Dual
> type?
> > > > > > > > > > > > > > > > > > > > >> > This is unclear in the KIP.
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > Thank you.
> > > > > > > > > > > > > > > > > > > > >> > Luke
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > On Thu, Oct 20, 2022 at 11:33 PM
> > > David
> > > > > > > Arthur
> > > > > > > > > > > > > > > > > > > > >> > <david.art...@confluent.io
> .invalid>
> > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > Igor, thanks for taking a
> look!
> > > Since
> > > > > JBOD
> > > > > > > > in
> > > > > > > > > > > KRaft
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > > under
> > > > > > > > > > > > > > > > > > > > >> > > discussion and not likely to
> land
> > > > > before
> > > > > > > the
> > > > > > > > > ZK
> > > > > > > > > > > > > > > migration, I
> > > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > >> we'll
> > > > > > > > > > > > > > > > > > > > >> > > need to defer it. For
> migrating
> > > JBOD
> > > > > > > > clusters
> > > > > > > > > > from
> > > > > > > > > > > > ZK
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > KRaft,
> > > > > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > > >> > also
> > > > > > > > > > > > > > > > > > > > >> > > need to address the log dir
> > > failure
> > > > > > > > mechanism
> > > > > > > > > > > which
> > > > > > > > > > > > > > > > currently
> > > > > > > > > > > > > > > > > > > uses a
> > > > > > > > > > > > > > > > > > > > >> > > special ZNode written to by
> the
> > > > > brokers.
> > > > > > > > There
> > > > > > > > > > is
> > > > > > > > > > > an
> > > > > > > > > > > > > old
> > > > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> > > > > > > > > > > > > > > > > > > > >> > > which proposes a new RPC to
> move
> > > that
> > > > > ZK
> > > > > > > > write
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > >> I'm
> > > > > > > > > > > > > > > > > > > > >> > > not sure if that's the
> approach
> > > we'll
> > > > > want
> > > > > > > > to
> > > > > > > > > > > take.
> > > > > > > > > > > > I
> > > > > > > > > > > > > > read
> > > > > > > > > > > > > > > > > > through
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > discussion over in KIP-858
> and it
> > > > > sounds
> > > > > > > > like
> > > > > > > > > > > there
> > > > > > > > > > > > > are
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > > > >> ideas
> > > > > > > > > > > > > > > > > > > > >> > > there.
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > > To answer your question more
> > > directly,
> > > > > > > > > migrating
> > > > > > > > > > > ZK
> > > > > > > > > > > > > > > clusters
> > > > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > > > >> JBOD
> > > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > > >> > > out of scope for this KIP. It
> > > might be
> > > > > > > > > possible
> > > > > > > > > > to
> > > > > > > > > > > > > stop
> > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > JBOD
> > > > > > > > > > > > > > > > > > > > >> using
> > > > > > > > > > > > > > > > > > > > >> > > reassignments, but I'm not
> sure
> > > about
> > > > > > > that.
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > > -David
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > > On Tue, Oct 18, 2022 at 12:17
> PM
> > > Igor
> > > > > > > > Soarez <
> > > > > > > > > > > > > > i...@soarez.me
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > > > Hi David,
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > Thanks for the KIP, this is
> very
> > > > > > > exciting!
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > How does JBOD relate to this
> > > work?
> > > > > KRaft
> > > > > > > > > mode
> > > > > > > > > > > > > doesn't
> > > > > > > > > > > > > > > yet
> > > > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > > >> > > > configuring brokers with
> > > multiple
> > > > > log
> > > > > > > > > > > directories.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > >> > > > existing cluster are
> configured
> > > with
> > > > > > > > > multiple
> > > > > > > > > > > log
> > > > > > > > > > > > > > dirs,
> > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > migration
> > > > > > > > > > > > > > > > > > > > >> > > > imply that the existing
> brokers
> > > > > need to
> > > > > > > > drop
> > > > > > > > > > use
> > > > > > > > > > > > of
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > feature?
> > > > > > > > > > > > > > > > > > > > >> Or is
> > > > > > > > > > > > > > > > > > > > >> > > > there some way to upgrade
> them
> > > > > later?
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > --
> > > > > > > > > > > > > > > > > > > > >> > > > Igor
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > On Mon, Oct 17, 2022, at
> 10:07
> > > PM,
> > > > > David
> > > > > > > > > > Arthur
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > > I've updated the KIP with
> the
> > > > > > > following
> > > > > > > > > > > changes
> > > > > > > > > > > > > (the
> > > > > > > > > > > > > > > > > > > confluence
> > > > > > > > > > > > > > > > > > > > >> diff
> > > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > > >> > > > not
> > > > > > > > > > > > > > > > > > > > >> > > > > helpful here since i
> > > rearranged
> > > > > some
> > > > > > > > > things)
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > * Added
> > > > > ZooKeeperBlockingKRaftMillis
> > > > > > > > > metric
> > > > > > > > > > > > > > > > > > > > >> > > > > * Added section on new
> broker
> > > > > > > > registration
> > > > > > > > > > > JSON
> > > > > > > > > > > > > > > > > > > > >> > > > > * Removed section on
> > > > > MigrationCheck
> > > > > > > RPC
> > > > > > > > > > > > > > > > > > > > >> > > > > * Added change to
> > > > > > > UpdateMetadataRequest
> > > > > > > > > > > > > > > > > > > > >> > > > > * Added section
> "Additional ZK
> > > > > Broker
> > > > > > > > > > Configs"
> > > > > > > > > > > > > > > (includes
> > > > > > > > > > > > > > > > > > > configs
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > > > connect
> > > > > > > > > > > > > > > > > > > > >> > > > > to KRaft quorum)
> > > > > > > > > > > > > > > > > > > > >> > > > > * Added section on
> > > "Incompatible
> > > > > > > > Brokers"
> > > > > > > > > > > under
> > > > > > > > > > > > > > > Failure
> > > > > > > > > > > > > > > > > > Modes
> > > > > > > > > > > > > > > > > > > > >> > > > > * Clarified many things
> per
> > > this
> > > > > > > > > discussion
> > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > I realized we need the
> KRaft
> > > > > > > controller
> > > > > > > > to
> > > > > > > > > > > pick
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > correct
> > > > > > > > > > > > > > > > > > > > >> > > > > "metadata.version" when
> > > > > initializing
> > > > > > > the
> > > > > > > > > > > > > migration.
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > included
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > IBP
> > > > > > > > > > > > > > > > > > > > >> > > > of a
> > > > > > > > > > > > > > > > > > > > >> > > > > broker in its registration
> > > data
> > > > > so the
> > > > > > > > > KRaft
> > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > verify
> > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > >> > > > IBP
> > > > > > > > > > > > > > > > > > > > >> > > > > and pick the correct
> > > > > > > "metadata.version"
> > > > > > > > > when
> > > > > > > > > > > > > > starting
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> migration.
> > > > > > > > > > > > > > > > > > > > >> > > > > Otherwise, we could
> > > inadvertently
> > > > > > > > > downgrade
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> IBP/metadata.version
> > > > > > > > > > > > > > > > > > > > >> > as
> > > > > > > > > > > > > > > > > > > > >> > > > > part of the migration.
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > I also added "clusterId"
> to
> > > the
> > > > > broker
> > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > data
> > > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> KRaft
> > > > > > > > > > > > > > > > > > > > >> > > > could
> > > > > > > > > > > > > > > > > > > > >> > > > > verify it.
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > -David
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > On Mon, Oct 17, 2022 at
> 12:14
> > > PM
> > > > > Colin
> > > > > > > > > > McCabe
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > cmcc...@apache.org
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >> On Fri, Oct 14, 2022, at
> > > 11:39,
> > > > > Jun
> > > > > > > Rao
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> > Hi, Colin,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> > 10. "That all goes
> away in
> > > the
> > > > > new
> > > > > > > > > mode,
> > > > > > > > > > > and
> > > > > > > > > > > > we
> > > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > > >> > code
> > > > > > > > > > > > > > > > > > > > >> > > > which
> > > > > > > > > > > > > > > > > > > > >> > > > >> > analyzes
> > > __cluster_metadata and
> > > > > > > > > reflects
> > > > > > > > > > it
> > > > > > > > > > > > in
> > > > > > > > > > > > > 1)
> > > > > > > > > > > > > > > > > updates
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > > > > > > > > >> > > 2)
> > > > > > > > > > > > > > > > > > > > >> > > > >> > messages sent out to
> > > brokers."
> > > > > > > > > > > > > > > > > > > > >> > > > >> > Hmm, I am not sure it's
> > > that
> > > > > > > simple.
> > > > > > > > > Some
> > > > > > > > > > > of
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > complexity
> > > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> ZK-based
> > > > > > > > > > > > > > > > > > > > >> > > > >> > controller are (1)
> using
> > > the
> > > > > > > > pipelining
> > > > > > > > > > > > > approach
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > write
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > for
> > > > > > > > > > > > > > > > > > > > >> > > > >> better
> > > > > > > > > > > > > > > > > > > > >> > > > >> > throughput and using
> > > > > conditional
> > > > > > > > writes
> > > > > > > > > > for
> > > > > > > > > > > > > > > > > correctness;
> > > > > > > > > > > > > > > > > > > (2)
> > > > > > > > > > > > > > > > > > > > >> > sending
> > > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> > proper LeaderAndIsr and
> > > > > > > > UpdateMetadata
> > > > > > > > > > > > > requests.
> > > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > > >> > during
> > > > > > > > > > > > > > > > > > > > >> > > > >> > controller failover,
> the
> > > full
> > > > > > > > metadata
> > > > > > > > > > > needs
> > > > > > > > > > > > to
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > sent
> > > > > > > > > > > > > > > > > > > while
> > > > > > > > > > > > > > > > > > > > >> > during
> > > > > > > > > > > > > > > > > > > > >> > > > >> > individual broker
> failure,
> > > only
> > > > > > > some
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> be
> > > > > > > > > > > > > > > > > > > > >> > > > updated.
> > > > > > > > > > > > > > > > > > > > >> > > > >> > The controlled shutdown
> > > > > handling
> > > > > > > > > > sometimes
> > > > > > > > > > > > uses
> > > > > > > > > > > > > > > > > > > > >> StopReplicaRequest
> > > > > > > > > > > > > > > > > > > > >> > > > and
> > > > > > > > > > > > > > > > > > > > >> > > > >> > some other times uses
> > > > > > > > > > LeaderAndIsrRequest.
> > > > > > > > > > > > (3)
> > > > > > > > > > > > > > > > > triggering
> > > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > >> > events
> > > > > > > > > > > > > > > > > > > > >> > > > >> based
> > > > > > > > > > > > > > > > > > > > >> > > > >> > on the responses of
> > > > > LeaderAndIsr
> > > > > > > > (e.g.
> > > > > > > > > > for
> > > > > > > > > > > > > topic
> > > > > > > > > > > > > > > > > > deletion).
> > > > > > > > > > > > > > > > > > > > >> Some
> > > > > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > > > > >> > > > those
> > > > > > > > > > > > > > > > > > > > >> > > > >> > complexity could be
> > > > > re-implemented
> > > > > > > > in a
> > > > > > > > > > > more
> > > > > > > > > > > > > > > > efficient
> > > > > > > > > > > > > > > > > > way,
> > > > > > > > > > > > > > > > > > > > >> but we
> > > > > > > > > > > > > > > > > > > > >> > > > need
> > > > > > > > > > > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > > > >> > be really careful not
> to
> > > > > generate
> > > > > > > > > > > regression.
> > > > > > > > > > > > > > Some
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > > >> > > > >> complexity
> > > > > > > > > > > > > > > > > > > > >> > > > >> > just won't go away.
> > > > > Reimplementing
> > > > > > > > all
> > > > > > > > > > > those
> > > > > > > > > > > > > > logic
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > 30
> > > > > > > > > > > > > > > > > > > > >> or
> > > > > > > > > > > > > > > > > > > > >> > so
> > > > > > > > > > > > > > > > > > > > >> > > > >> events
> > > > > > > > > > > > > > > > > > > > >> > > > >> > in the ZK-based
> controller
> > > is
> > > > > > > > possible,
> > > > > > > > > > but
> > > > > > > > > > > > > > seems a
> > > > > > > > > > > > > > > > bit
> > > > > > > > > > > > > > > > > > > > >> daunting
> > > > > > > > > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > > > > > > > > >> > > > >> risky.
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> Hi Jun,
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> Thanks for the
> response.  I
> > > agree
> > > > > > > that
> > > > > > > > > > there
> > > > > > > > > > > is
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > work
> > > > > > > > > > > > > > > > > > > here
> > > > > > > > > > > > > > > > > > > > >> but I
> > > > > > > > > > > > > > > > > > > > >> > > > don't
> > > > > > > > > > > > > > > > > > > > >> > > > >> think it's as bad as it
> might
> > > > > seem.
> > > > > > > > Most
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > code
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > writing
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > ZK
> > > > > > > > > > > > > > > > > > > > >> > > > can
> > > > > > > > > > > > > > > > > > > > >> > > > >> be reused from the old
> > > > > controller. We
> > > > > > > > > > should
> > > > > > > > > > > at
> > > > > > > > > > > > > > least
> > > > > > > > > > > > > > > > > > > evaluate
> > > > > > > > > > > > > > > > > > > > >> using
> > > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> old
> ControllerChannelManager
> > > as
> > > > > > > well. I
> > > > > > > > > > don't
> > > > > > > > > > > > > know
> > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > end
> > > > > > > > > > > > > > > > > > > > >> up
> > > > > > > > > > > > > > > > > > > > >> > > > doing it
> > > > > > > > > > > > > > > > > > > > >> > > > >> or not but it's a
> > > possibility.
> > > > > (The
> > > > > > > > main
> > > > > > > > > > > reason
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> that
> > > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> response handling will
> be a
> > > bit
> > > > > > > > > different)
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> The question of what to
> do
> > > > > during a
> > > > > > > > > > > controller
> > > > > > > > > > > > > > > failover
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > >> > > > interesting
> > > > > > > > > > > > > > > > > > > > >> > > > >> one. Technically, we
> should
> > > be
> > > > > able
> > > > > > > to
> > > > > > > > > > > continue
> > > > > > > > > > > > > > > sending
> > > > > > > > > > > > > > > > > > > > >> incremental
> > > > > > > > > > > > > > > > > > > > >> > > > updates
> > > > > > > > > > > > > > > > > > > > >> > > > >> to the legacy nodes, for
> the
> > > same
> > > > > > > > reason
> > > > > > > > > we
> > > > > > > > > > > can
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > > >> > > However,
> > > > > > > > > > > > > > > > > > > > >> > > > >> probably we should just
> copy
> > > > > what ZK
> > > > > > > > mode
> > > > > > > > > > > does
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > send
> > > > > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > >> full
> > > > > > > > > > > > > > > > > > > > >> > > > >> metadata update. This
> will
> > > allow
> > > > > us
> > > > > > > to
> > > > > > > > > have
> > > > > > > > > > > an
> > > > > > > > > > > > > easy
> > > > > > > > > > > > > > > way
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > handle
> > > > > > > > > > > > > > > > > > > > >> > > > >> divergences caused by
> lost
> > > RPCs
> > > > > by
> > > > > > > > > changing
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > (just
> > > > > > > > > > > > > > > > > > > > >> as
> > > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > > > > > > > > > >> > > > >> in ZK mode,
> unfortunately). I
> > > > > suppose
> > > > > > > > we
> > > > > > > > > > > should
> > > > > > > > > > > > > > > > document
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > KIP...
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> I agree controlled
> shutdown
> > > is
> > > > > tricky
> > > > > > > > to
> > > > > > > > > > get
> > > > > > > > > > > > just
> > > > > > > > > > > > > > > > right.
> > > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > > > >> suppose
> > > > > > > > > > > > > > > > > > > > >> > > this
> > > > > > > > > > > > > > > > > > > > >> > > > is
> > > > > > > > > > > > > > > > > > > > >> > > > >> a case where the RPCs we
> send
> > > > > out are
> > > > > > > > not
> > > > > > > > > > > > purely
> > > > > > > > > > > > > > > "fire
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > >> forget";
> > > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > > >> > > > have
> > > > > > > > > > > > > > > > > > > > >> > > > >> to listen for the
> response.
> > > But
> > > > > that
> > > > > > > > can
> > > > > > > > > be
> > > > > > > > > > > > done
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > >> event-based
> > > > > > > > > > > > > > > > > > > > >> > > way.
> > > > > > > > > > > > > > > > > > > > >> > > > >> Controlled shutdown is
> also
> > > > > probably
> > > > > > > > the
> > > > > > > > > > last
> > > > > > > > > > > > > thing
> > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > > >> implement
> > > > > > > > > > > > > > > > > > > > >> > > > once we
> > > > > > > > > > > > > > > > > > > > >> > > > >> have the basic lift and
> > > shift.
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> Topic deletion is another
> > > > > annoying
> > > > > > > > > thing. I
> > > > > > > > > > > > > wonder
> > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > >> just
> > > > > > > > > > > > > > > > > > > > >> > > > delete
> > > > > > > > > > > > > > > > > > > > >> > > > >> immediately here and
> skip the
> > > > > > > > complexity
> > > > > > > > > of
> > > > > > > > > > > > > > > > implementing
> > > > > > > > > > > > > > > > > > > > >> "deleting
> > > > > > > > > > > > > > > > > > > > >> > > > state."
> > > > > > > > > > > > > > > > > > > > >> > > > >> Topic IDs will exist in
> IBP
> > > 3.4,
> > > > > in
> > > > > > > > both
> > > > > > > > > ZK
> > > > > > > > > > > and
> > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > mode,
> > > > > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> > > > should be
> > > > > > > > > > > > > > > > > > > > >> > > > >> OK, right?
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> best,
> > > > > > > > > > > > > > > > > > > > >> > > > >> Colin
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> > Thanks,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> > Jun
> > > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> > On Fri, Oct 14, 2022 at
> > > 9:29 AM
> > > > > > > Colin
> > > > > > > > > > > McCabe
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > >> cmcc...@apache.org>
> > > > > > > > > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> On Thu, Oct 13, 2022,
> at
> > > > > 11:44,
> > > > > > > Jun
> > > > > > > > > Rao
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > Hi, Colin,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > Thanks for the
> reply.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > 10. This is a bit
> on the
> > > > > > > > > > implementation
> > > > > > > > > > > > > side.
> > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> existing
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > ZK-based controller,
> > > most
> > > > > of the
> > > > > > > > > logic
> > > > > > > > > > > is
> > > > > > > > > > > > > > around
> > > > > > > > > > > > > > > > > > > > >> maintaining an
> > > > > > > > > > > > > > > > > > > > >> > > > >> in-memory
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > state of all the
> > > resources
> > > > > > > > (broker,
> > > > > > > > > > > topic,
> > > > > > > > > > > > > > > > > partition,
> > > > > > > > > > > > > > > > > > > > etc),
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> reading/writing
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > to ZK, sending
> > > LeaderAndIsr
> > > > > and
> > > > > > > > > > > > > UpdateMetadata
> > > > > > > > > > > > > > > > > > requests
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > >> > > > handling
> > > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > responses to
> brokers.
> > > So we
> > > > > need
> > > > > > > > all
> > > > > > > > > > > that
> > > > > > > > > > > > > > logic
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > dual
> > > > > > > > > > > > > > > > > > > > >> > write
> > > > > > > > > > > > > > > > > > > > >> > > > >> mode.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> One
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > option is to
> duplicate
> > > all
> > > > > that
> > > > > > > > > logic
> > > > > > > > > > in
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > code.
> > > > > > > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > > > >> > can
> > > > > > > > > > > > > > > > > > > > >> > > > be a
> > > > > > > > > > > > > > > > > > > > >> > > > >> bit
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > error prone and
> makes
> > > the
> > > > > code a
> > > > > > > > bit
> > > > > > > > > > > > harder
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > maintain
> > > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > > >> > > need
> > > > > > > > > > > > > > > > > > > > >> > > > to
> > > > > > > > > > > > > > > > > > > > >> > > > >> fix
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > some critical
> issues in
> > > > > ZK-based
> > > > > > > > > > > > > controllers.
> > > > > > > > > > > > > > > > > Another
> > > > > > > > > > > > > > > > > > > > >> option is
> > > > > > > > > > > > > > > > > > > > >> > > to
> > > > > > > > > > > > > > > > > > > > >> > > > try
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > reusing the existing
> > > code
> > > > > in the
> > > > > > > > > > > ZK-based
> > > > > > > > > > > > > > > > > controller.
> > > > > > > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > > > > > >> > > example,
> > > > > > > > > > > > > > > > > > > > >> > > > we
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> could
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > start the
> EventManager
> > > in
> > > > > the
> > > > > > > > > ZK-based
> > > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > let
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > controller ingest
> new
> > > > > events.
> > > > > > > This
> > > > > > > > > has
> > > > > > > > > > > its
> > > > > > > > > > > > > own
> > > > > > > > > > > > > > > > > > > challenges:
> > > > > > > > > > > > > > > > > > > > >> (1)
> > > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> existing
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > logic only logs ZK
> > > failures
> > > > > and
> > > > > > > > > > doesn't
> > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > caller;
> > > > > > > > > > > > > > > > > > > > >> > > > (2)
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > existing logic may
> add
> > > new
> > > > > > > events
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > itself
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > >> > > > probably
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> need
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > to think through how
> > > this is
> > > > > > > > > > coordinated
> > > > > > > > > > > > > with
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > > controller;
> > > > > > > > > > > > > > > > > > > > >> > > > >> (3)
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> it
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > registers some ZK
> > > listeners
> > > > > > > > > > > unnecessarily
> > > > > > > > > > > > > (may
> > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > big
> > > > > > > > > > > > > > > > > > > > >> > > > concern).
> > > > > > > > > > > > > > > > > > > > >> > > > >> So
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > need to get around
> those
> > > > > issues
> > > > > > > > > > > somehow. I
> > > > > > > > > > > > > am
> > > > > > > > > > > > > > > > > > wondering
> > > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > > >> > > have
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > considered both
> options
> > > and
> > > > > > > which
> > > > > > > > > > > approach
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > leaning
> > > > > > > > > > > > > > > > > > > > >> > towards
> > > > > > > > > > > > > > > > > > > > >> > > > for
> > > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > implementation.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> Yes, this is a good
> > > question.
> > > > > My
> > > > > > > > take
> > > > > > > > > is
> > > > > > > > > > > > that
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > big
> > > > > > > > > > > > > > > > > part
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> complexity
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> in the old controller
> code
> > > > > results
> > > > > > > > > from
> > > > > > > > > > > the
> > > > > > > > > > > > > fact
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > >> ZK
> > > > > > > > > > > > > > > > > > > > >> > as
> > > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> multi-writer database
> for
> > > > > > > > propagating
> > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > > between
> > > > > > > > > > > > > > > > > > > > >> > different
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> components. So in the
> old
> > > > > > > > controller,
> > > > > > > > > > > every
> > > > > > > > > > > > > > write
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > be
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> structured as a
> compare
> > > and
> > > > > swap
> > > > > > > to
> > > > > > > > be
> > > > > > > > > > > fully
> > > > > > > > > > > > > > > > correct.
> > > > > > > > > > > > > > > > > > > Every
> > > > > > > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > > >> > > > get
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> notified about
> something,
> > > it's
> > > > > > > > usually
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > form
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > "this
> > > > > > > > > > > > > > > > > > > > >> znode
> > > > > > > > > > > > > > > > > > > > >> > > > >> changed"
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> which prompts a full
> > > reload of
> > > > > > > part
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > data
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > (which
> > > > > > > > > > > > > > > > > > > > >> > itself
> > > > > > > > > > > > > > > > > > > > >> > > > has
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> multiple parts,
> loading,
> > > > > > > > > deserializing,
> > > > > > > > > > > > > > > reconciling,
> > > > > > > > > > > > > > > > > > etc.)
> > > > > > > > > > > > > > > > > > > > >> That
> > > > > > > > > > > > > > > > > > > > >> > all
> > > > > > > > > > > > > > > > > > > > >> > > > goes
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> away in the new mode,
> and
> > > we
> > > > > just
> > > > > > > > have
> > > > > > > > > > > some
> > > > > > > > > > > > > code
> > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > > >> analyzes
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> __cluster_metadata and
> > > > > reflects it
> > > > > > > > in
> > > > > > > > > 1)
> > > > > > > > > > > > > updates
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > 2)
> > > > > > > > > > > > > > > > > > > > >> > > > messages
> > > > > > > > > > > > > > > > > > > > >> > > > >> sent
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> out to brokers.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> This is pretty
> decoupled
> > > from
> > > > > the
> > > > > > > > > other
> > > > > > > > > > > > logic
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> QuorumController
> > > > > > > > > > > > > > > > > > > > >> > > and
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> should be easy to unit
> > > test,
> > > > > since
> > > > > > > > the
> > > > > > > > > > > same
> > > > > > > > > > > > > > inputs
> > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> log
> > > > > > > > > > > > > > > > > > > > >> > > > always
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> produce the same
> output
> > > in ZK.
> > > > > > > > > > Basically,
> > > > > > > > > > > ZK
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > write-only
> > > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > >> > us,
> > > > > > > > > > > > > > > > > > > > >> > > > we do
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> not read it (with the
> big
> > > > > > > exception
> > > > > > > > of
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > > > > > >> > znodes)
> > > > > > > > > > > > > > > > > > > > >> > > > and I
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> think that will
> greatly
> > > > > simplify
> > > > > > > > > things.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> So I think dual-write
> > > mode as
> > > > > > > > > described
> > > > > > > > > > > here
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > >> > substantially
> > > > > > > > > > > > > > > > > > > > >> > > > >> simpler
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> than trying to run
> part or
> > > > > all of
> > > > > > > > the
> > > > > > > > > > old
> > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> > parallel. I
> > > > > > > > > > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> think we will reuse a
> > > bunch
> > > > > of the
> > > > > > > > > > > > > > serialization /
> > > > > > > > > > > > > > > > > > > > >> > deserialization
> > > > > > > > > > > > > > > > > > > > >> > > > code
> > > > > > > > > > > > > > > > > > > > >> > > > >> for
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> znodes and possibly
> the
> > > code
> > > > > for
> > > > > > > > > > > > communicating
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > ZK.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> best,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> Colin
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > 14. Good point and
> make
> > > > > sense.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > Thanks,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > Jun
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> > On Wed, Oct 12,
> 2022 at
> > > > > 3:27 PM
> > > > > > > > > Colin
> > > > > > > > > > > > > McCabe <
> > > > > > > > > > > > > > > > > > > > >> > cmcc...@apache.org
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >> wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Hi Jun,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Thanks for taking a
> > > look.
> > > > > I can
> > > > > > > > > > answer
> > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > questions
> > > > > > > > > > > > > > > > > > > > here
> > > > > > > > > > > > > > > > > > > > >> > > > because I
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> collaborated on
> this a
> > > > > bit, and
> > > > > > > > > David
> > > > > > > > > > > is
> > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > vacation
> > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > >> few
> > > > > > > > > > > > > > > > > > > > >> > > > days.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> On Wed, Oct 12,
> 2022,
> > > at
> > > > > 14:41,
> > > > > > > > Jun
> > > > > > > > > > Rao
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Hi, David,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Thanks for the
> KIP.
> > > A few
> > > > > > > > > comments
> > > > > > > > > > > > below.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 10. It's still
> not
> > > very
> > > > > clear
> > > > > > > > to
> > > > > > > > > me
> > > > > > > > > > > how
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > > >> > > > works
> > > > > > > > > > > > > > > > > > > > >> > > > >> in
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > dual writes mode
> to
> > > > > KRaft log
> > > > > > > > and
> > > > > > > > > > ZK
> > > > > > > > > > > > when
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > > >> still
> > > > > > > > > > > > > > > > > > > > >> > > run
> > > > > > > > > > > > > > > > > > > > >> > > > in
> > > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> mode.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Does the KRaft
> > > controller
> > > > > > > run a
> > > > > > > > > ZK
> > > > > > > > > > > > based
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> > > parallel
> > > > > > > > > > > > > > > > > > > > >> > > > or
> > > > > > > > > > > > > > > > > > > > >> > > > >> do
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > derive what
> needs to
> > > be
> > > > > > > written
> > > > > > > > > to
> > > > > > > > > > ZK
> > > > > > > > > > > > > based
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> logic?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We derive what
> needs
> > > to be
> > > > > > > > written
> > > > > > > > > to
> > > > > > > > > > > ZK
> > > > > > > > > > > > > > based
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> logic.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > I am also not
> sure
> > > how
> > > > > the
> > > > > > > > KRaft
> > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > handles
> > > > > > > > > > > > > > > > > > > > >> broker
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > registration/deregistration,
> > > > > > > > > since
> > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > > > > >> running
> > > > > > > > > > > > > > > > > > > > >> > > in
> > > > > > > > > > > > > > > > > > > > >> > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> mode
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> and
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > are not
> heartbeating
> > > to
> > > > > the
> > > > > > > > KRaft
> > > > > > > > > > > > > > controller.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> The new controller
> will
> > > > > listen
> > > > > > > > for
> > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > registrations
> > > > > > > > > > > > > > > > > > > > >> under
> > > > > > > > > > > > > > > > > > > > >> > > > >> /brokers.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> This is the only
> znode
> > > > > watch
> > > > > > > that
> > > > > > > > > the
> > > > > > > > > > > new
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > > >> do.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We did consider
> > > changing
> > > > > how
> > > > > > > > > ZK-based
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > > > > > >> > > worked,
> > > > > > > > > > > > > > > > > > > > >> > > > >> but it
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> just ended up
> being too
> > > > > much
> > > > > > > work
> > > > > > > > > for
> > > > > > > > > > > not
> > > > > > > > > > > > > > > enough
> > > > > > > > > > > > > > > > > > gain.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 12. "A new set of
> > > nodes
> > > > > will
> > > > > > > be
> > > > > > > > > > > > > provisioned
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > host
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > > controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> quorum."
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > I guess we don't
> > > support
> > > > > > > > starting
> > > > > > > > > > the
> > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > >> > quorum
> > > > > > > > > > > > > > > > > > > > >> > > on
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> existing
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > brokers. It
> would be
> > > > > useful
> > > > > > > to
> > > > > > > > > make
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > clear.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Agreed
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 13. "Once the
> quorum
> > > is
> > > > > > > > > established
> > > > > > > > > > > > and a
> > > > > > > > > > > > > > > > leader
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> elected,
> > > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > will check the
> state
> > > of
> > > > > the
> > > > > > > > > cluster
> > > > > > > > > > > > using
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> MigrationCheck
> > > > > > > > > > > > > > > > > > > > >> > > > RPC."
> > > > > > > > > > > > > > > > > > > > >> > > > >> How
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> does
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > the quorum
> controller
> > > > > detect
> > > > > > > > > other
> > > > > > > > > > > > > brokers?
> > > > > > > > > > > > > > > > Does
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> node
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> need
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > to be configured
> > > with ZK
> > > > > > > > > connection
> > > > > > > > > > > > > string?
> > > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > > so,
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > > > >> > be
> > > > > > > > > > > > > > > > > > > > >> > > > >> useful
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> to
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > document the
> > > additional
> > > > > > > configs
> > > > > > > > > > that
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > > >> > > > needs to
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> set.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yes, the
> controllers
> > > > > monitor ZK
> > > > > > > > for
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > registrations,
> > > > > > > > > > > > > > > > > > > > >> as I
> > > > > > > > > > > > > > > > > > > > >> > > > >> mentioned
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> above. So they need
> > > > > zk.connect
> > > > > > > > and
> > > > > > > > > > the
> > > > > > > > > > > > > other
> > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > connection
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> configurations.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 14. "In order to
> > > prevent
> > > > > > > > further
> > > > > > > > > > > writes
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > ZK,
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > > > >> > thing
> > > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> new
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > KRaft quorum
> must do
> > > is
> > > > > take
> > > > > > > > over
> > > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > >> > > > controller.
> > > > > > > > > > > > > > > > > > > > >> > > > >> "
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> The
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> ZK
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > controller
> processing
> > > > > changes
> > > > > > > > to
> > > > > > > > > > > > > > /controller
> > > > > > > > > > > > > > > > > update
> > > > > > > > > > > > > > > > > > > > >> > > > asynchronously.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> How
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > does the KRaft
> > > controller
> > > > > > > know
> > > > > > > > > when
> > > > > > > > > > > the
> > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > > > > > >> > > > resigned
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> before
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > it can safely
> copy
> > > the ZK
> > > > > > > data?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> This should be done
> > > through
> > > > > > > > > > > > > > > > > > > > >> expectedControllerEpochZkVersion,
> > > > > > > > > > > > > > > > > > > > >> > > just
> > > > > > > > > > > > > > > > > > > > >> > > > >> like
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> in
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> ZK mode, right? We
> > > should
> > > > > bump
> > > > > > > > this
> > > > > > > > > > > epoch
> > > > > > > > > > > > > > value
> > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > > >> > > writes
> > > > > > > > > > > > > > > > > > > > >> > > > >> from
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> old controller will
> > > not go
> > > > > > > > > through. I
> > > > > > > > > > > > agree
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > > spell
> > > > > > > > > > > > > > > > > > > > >> > this
> > > > > > > > > > > > > > > > > > > > >> > > > out
> > > > > > > > > > > > > > > > > > > > >> > > > >> in
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> KIP.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 15. We have the
> > > following
> > > > > > > > > > sentences.
> > > > > > > > > > > > One
> > > > > > > > > > > > > > says
> > > > > > > > > > > > > > > > > > > > >> ControllerId
> > > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > > >> > > > >> random
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > KRaft broker and
> the
> > > > > other
> > > > > > > says
> > > > > > > > > > it's
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > > >> controller.
> > > > > > > > > > > > > > > > > > > > >> > > > Which
> > > > > > > > > > > > > > > > > > > > >> > > > >> one
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> is
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > correct?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > "UpdateMetadata:
> for
> > > > > certain
> > > > > > > > > > metadata
> > > > > > > > > > > > > > > changes,
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> > > > controller
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> will
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > need to send
> > > > > > > > > UpdateMetadataRequests
> > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > brokers.
> > > > > > > > > > > > > > > > > > > > >> For
> > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > “ControllerId”
> field
> > > in
> > > > > this
> > > > > > > > > > request,
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > >> should
> > > > > > > > > > > > > > > > > > > > >> > > > >> specify a
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > random KRaft
> broker."
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > "In the
> > > > > UpdateMetadataRequest
> > > > > > > > > sent
> > > > > > > > > > by
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > > >> > to
> > > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > brokers, the
> > > ControllerId
> > > > > > > will
> > > > > > > > > > point
> > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > > >> > controller
> > > > > > > > > > > > > > > > > > > > >> > > > which
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> will
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> be
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > used for the
> > > inter-broker
> > > > > > > > > > requests."
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yeah, this seems
> like
> > > an
> > > > > error
> > > > > > > to
> > > > > > > > > me
> > > > > > > > > > as
> > > > > > > > > > > > > > well. A
> > > > > > > > > > > > > > > > > > random
> > > > > > > > > > > > > > > > > > > > >> value
> > > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > > >> > > > not
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> really
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> useful. Plus the
> text
> > > here
> > > > > is
> > > > > > > > > > > > > > > self-contradictory,
> > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > > > >> > pointed
> > > > > > > > > > > > > > > > > > > > >> > > > out.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> I suspect what we
> > > should do
> > > > > > > here
> > > > > > > > is
> > > > > > > > > > > add a
> > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > field,
> > > > > > > > > > > > > > > > > > > > >> > > > >> KRaftControllerId,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> and populate it
> with
> > > the
> > > > > real
> > > > > > > > > > > controller
> > > > > > > > > > > > > ID,
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > leave
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> old
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> controllerId
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> field as -1. A
> ZK-based
> > > > > broker
> > > > > > > > that
> > > > > > > > > > > sees
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > then
> > > > > > > > > > > > > > > > > > > > >> consult
> > > > > > > > > > > > > > > > > > > > >> > > its
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > controller.quorum.voters
> > > > > > > > > > configuration
> > > > > > > > > > > to
> > > > > > > > > > > > > see
> > > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> should
> > > > > > > > > > > > > > > > > > > > >> > > send
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller-bound
> RPCs.
> > > That
> > > > > > > > > (static)
> > > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > > > lets
> > > > > > > > > > > > > > > > > > > > us
> > > > > > > > > > > > > > > > > > > > >> map
> > > > > > > > > > > > > > > > > > > > >> > > > >> between
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller ID and
> > > > > host:port.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We should still
> keep
> > > our
> > > > > > > existing
> > > > > > > > > > epoch
> > > > > > > > > > > > > logic
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > deciding
> > > > > > > > > > > > > > > > > > > > >> > when
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> UpdateMetadataRequest /
> > > > > > > > > > > > > LeaderAndIsrRequests
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > stale,
> > > > > > > > > > > > > > > > > > > > >> with
> > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > >> > > > >> caveat
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> that any
> kraft-based
> > > epoch
> > > > > > > should
> > > > > > > > > be
> > > > > > > > > > > > > treated
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > greater
> > > > > > > > > > > > > > > > > > > > >> than
> > > > > > > > > > > > > > > > > > > > >> > any
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> ZK-based
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> epoch. After all,
> the
> > > kraft
> > > > > > > epoch
> > > > > > > > > is
> > > > > > > > > > > > coming
> > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > epoch
> > > > > > > > > > > > > > > > > > > > >> of
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> __cluster_metadata,
> > > > > whereas the
> > > > > > > > ZK
> > > > > > > > > > > epoch
> > > > > > > > > > > > > > comes
> > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > ZK.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 16.
> "Additionally,
> > > the
> > > > > > > > controller
> > > > > > > > > > > must
> > > > > > > > > > > > > > > specify
> > > > > > > > > > > > > > > > > if a
> > > > > > > > > > > > > > > > > > > > >> broker
> > > > > > > > > > > > > > > > > > > > >> > in
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> “LiveBrokers”
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > is KRaft or ZK."
> Does
> > > > > that
> > > > > > > > > require
> > > > > > > > > > > any
> > > > > > > > > > > > > > > protocol
> > > > > > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> UpdateMetadata?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yeah, I am also
> curious
> > > > > why the
> > > > > > > > we
> > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > > care
> > > > > > > > > > > > > > > > > > whether
> > > > > > > > > > > > > > > > > > > > >> > brokers
> > > > > > > > > > > > > > > > > > > > >> > > > are
> > > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> or
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> KRaft in
> > > > > UpdateMetadataRequest.
> > > > > > > > We
> > > > > > > > > > > don't
> > > > > > > > > > > > > > reveal
> > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > clients,
> > > > > > > > > > > > > > > > > > > > >> > > > so
> > > > > > > > > > > > > > > > > > > > >> > > > >> can
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> just leave this
> out?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> best,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Colin
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Thanks,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Jun
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > On Wed, Oct 5,
> 2022
> > > at
> > > > > 10:07
> > > > > > > AM
> > > > > > > > > > > Mickael
> > > > > > > > > > > > > > > Maison
> > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> mickael.mai...@gmail.com
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Hi David,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Thanks for
> starting
> > > this
> > > > > > > > > important
> > > > > > > > > > > > KIP.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> I've just taken
> a
> > > quick
> > > > > look
> > > > > > > > so
> > > > > > > > > > far
> > > > > > > > > > > > but
> > > > > > > > > > > > > > I've
> > > > > > > > > > > > > > > > > got a
> > > > > > > > > > > > > > > > > > > > >> couple
> > > > > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > > > > >> > > > >> initial
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> questions:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> 1) What happens
> if
> > > a non
> > > > > > > KRaft
> > > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > (or
> > > > > > > > > > > > > > > > > > > > >> with
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > kafka.metadata.migration.enable
> > > > > > > > > > set
> > > > > > > > > > > to
> > > > > > > > > > > > > > > false)
> > > > > > > > > > > > > > > > > > joins
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > cluster
> > > > > > > > > > > > > > > > > > > > >> > > > >> after
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> the migration is
> > > > > triggered?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> 2) In the
> Failure
> > > Modes
> > > > > > > > section
> > > > > > > > > > you
> > > > > > > > > > > > > > mention
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > scenario
> > > > > > > > > > > > > > > > > > > > >> > where
> > > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > > >> > > > >> write
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> to ZK fails.
> What
> > > > > happens
> > > > > > > when
> > > > > > > > > the
> > > > > > > > > > > > > > > divergence
> > > > > > > > > > > > > > > > > > limit
> > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> > > > reached? Is
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> this a fatal
> > > condition?
> > > > > How
> > > > > > > > much
> > > > > > > > > > > > > > divergence
> > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > >> > allow?
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Thanks,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Mickael
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> On Wed, Oct 5,
> 2022
> > > at
> > > > > 12:20
> > > > > > > > AM
> > > > > > > > > > > David
> > > > > > > > > > > > > > > Arthur <
> > > > > > > > > > > > > > > > > > > > >> > > mum...@gmail.com
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> wrote:
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > Hey folks, I
> > > wanted
> > > > > to get
> > > > > > > > the
> > > > > > > > > > > ball
> > > > > > > > > > > > > > > rolling
> > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > > discussion
> > > > > > > > > > > > > > > > > > > > >> > > > >> for
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > ZooKeeper
> > > migration
> > > > > KIP.
> > > > > > > > This
> > > > > > > > > > KIP
> > > > > > > > > > > > > > details
> > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > plan
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > do
> > > > > > > > > > > > > > > > > > > > >> > > > an
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> online
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > migration of
> > > metadata
> > > > > from
> > > > > > > > > > > ZooKeeper
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > > > >> as
> > > > > > > > > > > > > > > > > > > > >> > a
> > > > > > > > > > > > > > > > > > > > >> > > > >> rolling
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > upgrade of
> > > brokers to
> > > > > > > KRaft
> > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > The general
> idea
> > > is to
> > > > > > > keep
> > > > > > > > > > KRaft
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > ZooKeeper
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> sync
> > > > > > > > > > > > > > > > > > > > >> > > > during
> > > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > migration, so
> both
> > > > > types
> > > > > > > of
> > > > > > > > > > > brokers
> > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > exist
> > > > > > > > > > > > > > > > > > > > >> > > simultaneously.
> > > > > > > > > > > > > > > > > > > > >> > > > >> Then,
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > once
> everything is
> > > > > > > migrated
> > > > > > > > > and
> > > > > > > > > > > > > updated,
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > turn
> > > > > > > > > > > > > > > > > > > > >> off
> > > > > > > > > > > > > > > > > > > > >> > > > >> ZooKeeper
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > writes.
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > This is a
> pretty
> > > > > complex
> > > > > > > > KIP,
> > > > > > > > > so
> > > > > > > > > > > > > please
> > > > > > > > > > > > > > > > take a
> > > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > > :)
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > Thanks!
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > David
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > > >> > > > > --
> > > > > > > > > > > > > > > > > > > > >> > > > > -David
> > > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> > > --
> > > > > > > > > > > > > > > > > > > > >> > > -David
> > > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > -David
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > -David
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > -David
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -David
> > > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > David Arthur
> > > > >
> > >
> > >
> > >
> > > --
> > > David Arthur
> > >
>
>
>
> --
> David Arthur
>

Reply via email to