On Wed, Nov 17, 2021, at 12:28, David Arthur wrote:
> Colin,
>
> I wasn't intending to suggest that IBP and metadata.version co-exist long
> term in KRaft. I was thinking we would have one final IBP that would signal
> KRaft clusters to start looking at metadata.version instead. This
> was working under the assumption that we should be able to upgrade 3.0
> clusters to a version with metadata.version. I don't think it would be too
> difficult since we could treat the absence of a feature flag as an implicit
> version 0 which would be compatible with 3.0.
>

Hi David,

Hmm... do we really have to have "one final IBP" for upgraded 3.0 / 3.1 KRaft 
clusters? It seems like we could just initialize their metadata.version to 1
when the software is upgraded to 3.2.

Admittedly this is a bit of a hack -- in general, we don't want to couple 
software upgrades and metadata version bumps. But 3.0 and 3.1 were "preview," 
not intended for production, and this is a pretty minor speed bump for someone 
testing out pre-production software.

This would greatly simplify the system by getting rid of IBP completely in
KRaft clusters and substituting metadata.version for it.
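
To make that concrete, here is a minimal sketch (in Java, with made-up class and
method names -- this is not code from the KIP, just an illustration of the idea)
of how a broker might pick its effective version once IBP is ignored in KRaft mode:

// Purely illustrative; the names here are hypothetical, not from KIP-778.
public final class EffectiveVersionResolver {
    /**
     * In KRaft mode the broker would ignore inter.broker.protocol.version entirely
     * and use the finalized metadata.version from the metadata log. In ZK mode it
     * would keep using the statically configured IBP, as it does today.
     */
    public static short resolve(boolean kraftMode,
                                short finalizedMetadataVersion, // from FeatureLevelRecord, 0 if absent
                                short configuredIbpLevel) {     // parsed from inter.broker.protocol.version
        if (kraftMode) {
            // Upgraded 3.0/3.1 preview clusters would be initialized to level 1
            // when the 3.2 software comes up, so a finalized level is always present.
            return finalizedMetadataVersion >= 1 ? finalizedMetadataVersion : 1;
        }
        return configuredIbpLevel;
    }
}

The point is just that in KRaft mode the statically configured IBP never enters
the picture; only the finalized metadata.version does.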

>
> Assuming this lands in 3.2, we would have
>
> Software    IBP    metadata.version    effective version
> 3.0         3.0    -                   0
> 3.2         3.0    -                   0
> 3.2         3.2    -                   0
> 3.2         3.2    1                   1
>
> where metadata.version=0 is compatible with what we have today in KRaft.
>
> If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be
> rather inconvenienced :) I can't say for sure that the extra complexity is
> worth the convenience of allowing upgrades from the preview versions.
>

So let's drill down to what the inconvenience would be. It would basically be 
an unexpected (if they didn't read the upgrade docs) IBP bump to whatever the 
IBP is in 3.2. Assuming that they started on KAFKA_3_0_IV0 and got 
auto-upgraded to the latest, after the upgrade they would have these additional 
features:

> 1. Introduce ListOffsets V7 (KIP-724)
> 2. Add topic IDs to Fetch requests/responses (KIP-516)

That seems pretty minor (again, noting that this ONLY applies to KRaft 3.0 
Preview users, NOT to ZK users). I don't think it's worth keeping IBP around 
just to prevent this inconvenience.

We have a limited amount of time and effort available to manage versioning 
issues. I don't think it's realistic to have to think about the cross-product 
of all possible metadata.version and IBP values when making a change.

We also have to expect new features to keep going in while the transition is 
taking place. That's why I think having a 1:1 mapping between new IBP versions 
and metadata.versions is helpful.
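
As a very rough sketch of what such a 1:1 mapping could look like (the constant
names and levels below are invented for illustration, not the KIP's actual table):

// Illustrative only: release names and levels are hypothetical.
public enum MetadataVersionLevel {
    // Level 0: ZooKeeper clusters / metadata.version not finalized.
    UNINITIALIZED((short) 0, null),
    // Level 1: first KRaft metadata.version, corresponding to the 3.2 IBP.
    IBP_3_2_IV0((short) 1, "3.2-IV0");
    // Each subsequent IBP release would add exactly one new constant here,
    // keeping the mapping between IBP names and metadata.version levels 1:1.

    private final short level;
    private final String ibpName;

    MetadataVersionLevel(short level, String ibpName) {
        this.level = level;
        this.ibpName = ibpName;
    }

    public short level() { return level; }
    public String ibpName() { return ibpName; }
}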

best,
Colin

>
> We probably need to make a decision on this since it impacts a few things
> in the KIP. What do folks think?
>
> -David
>
>
> On Wed, Nov 17, 2021 at 2:28 PM Jason Gustafson <ja...@confluent.io.invalid>
> wrote:
>
>> Hi David,
>>
>> Forgive me if this ground has been covered already. Today, we have a few
>> other things that we have latched onto the IBP, such as upgrades to the
>> format of records in __consumer_offsets. I've been assuming that
>> metadata.version is not covering this. Is that right or is there some other
>> plan to take care of cases like this?
>>
>> Thanks,
>> Jason
>>
>>
>>
>> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao <j...@confluent.io.invalid> wrote:
>>
>> > Hi, Colin,
>> >
>> > Thanks for the reply.
>> >
>> > For case b, I am not sure that I understand your suggestion. Does "each
>> > subsequent level for metadata.version corresponds to an IBP version" mean
>> > that we need to keep IBP forever? Could you describe the upgrade process
>> in
>> > this case?
>> >
>> > Jun
>> >
>> > On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe <cmcc...@apache.org> wrote:
>> >
>> > > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
>> > > > Hi, David, Colin,
>> > > >
>> > > > Thanks for the reply.
>> > > >
>> > > > 16. Discussed with David offline a bit. We have 3 cases.
>> > > > a. We upgrade from an old version where the metadata.version has
>> > already
>> > > > been finalized. In this case it makes sense to stay with that feature
>> > > > version after the upgrade.
>> > >
>> > > +1
>> > >
>> > > > b. We upgrade from an old version where no metadata.version has been
>> > > > finalized. In this case, it makes sense to leave metadata.version
>> > > disabled
>> > > > since we don't know if all brokers have been upgraded.
>> > >
>> > > This is the scenario I was hoping to avoid by saying that ALL KRaft
>> > > clusters have metadata.version of at least 1, and each subsequent level
>> > for
>> > > metadata.version corresponds to an IBP version. The existing KRaft
>> > clusters
>> > > in 3.0 and earlier are preview (not for production) so I think this
>> > change
>> > > is OK for 3.x (given that it affects only KRaft). Then IBP is
>> irrelevant
>> > > for KRaft clusters (the config is ignored, possibly with a WARN or
>> ERROR
>> > > message generated if it is set).
>> > >
>> > > > c. We are starting from a brand new cluster and of course no
>> > > > metadata.version has been finalized. In this case, the KIP says it
>> will
>> > > > pick the metadata.version in meta.properties. In the common case,
>> > people
>> > > > probably won't set the metadata.version in the meta.properties file
>> > > > explicitly. So, it will be useful to put a default (stable) version
>> > > > there when generating the meta.properties file.
>> > >
>> > > Hmm. I was assuming that clusters where the admin didn't specify any
>> > > metadata.version during formatting would get the latest
>> metadata.version.
>> > > Partly, because this is what we do for IBP today. It would be good to
>> > > clarify this...
>> > >
>> > > >
>> > > > Also, it would be useful to clarify that if a FeatureLevelRecord
>> exists
>> > > for
>> > > > metadata.version, the metadata.version in meta.properties will be
>> > > ignored.
>> > > >
>> > >
>> > > Yeah, I agree.
>> > >
>> > > best,
>> > > Colin
>> > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > >
>> > > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe <cmcc...@apache.org>
>> > > wrote:
>> > > >
>> > > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
>> > > >> > Hi, David,
>> > > >> >
>> > > >> > Thanks for the reply.
>> > > >> >
>> > > >> > 16. My first concern is that the KIP picks up metadata.version
>> > > inconsistently
>> > > >> > during the deployment. If a new cluster is started, we pick up the
>> > > >> highest
>> > > >> > version. If we upgrade, we leave the feature version unchanged.
>> > > >>
>> > > >> Hi Jun,
>> > > >>
>> > > >> Thanks again for taking a look.
>> > > >>
>> > > >> The proposed behavior in KIP-778 is consistent with how it works
>> > today.
>> > > >> Upgrading the software is distinct from upgrading the IBP.
>> > > >>
>> > > >> I think it is important to keep these two operations ("upgrading
>> > > >> IBP/metadata version" and "upgrading software version") separate. If
>> > > they
>> > > >> are coupled it will create a situation where software upgrades are
>> > > >> difficult and dangerous.
>> > > >>
>> > > >> Consider a situation where you find some bug in your current
>> software,
>> > > and
>> > > >> you want to upgrade to new software that fixes the bug. If upgrades
>> > and
>> > > IBP
>> > > >> bumps are coupled, you can't do this without also bumping the IBP,
>> > > which is
>> > > >> usually considered a high-risk change. That means that either you
>> have
>> > > to
>> > > >> make a special build that includes only the fix (time-consuming and
>> > > >> error-prone), live with the bug for longer, or be very conservative
>> > > about
>> > > >> ever introducing new IBP/metadata versions. None of those are really
>> > > good
>> > > >> choices.
>> > > >>
>> > > >> > Intuitively, it seems that independent of how a cluster is
>> deployed,
>> > > we
>> > > >> > should always pick the same feature version.
>> > > >>
>> > > >> I think it makes sense to draw a distinction between upgrading an
>> > > existing
>> > > >> cluster and deploying a new one. What most people want out of
>> upgrades
>> > > is
>> > > >> that things should keep working, but with bug fixes. If we change
>> > that,
>> > > it
>> > > >> just makes people more reluctant to upgrade (which is always a
>> > > problem...)
>> > > >>
>> > > >> > I think we need to think this through in this KIP. My second
>> concern
>> > > is
>> > > >> > that as a particular version matures, it's inconvenient for a user
>> > to
>> > > >> manually
>> > > >> > upgrade every feature version. As long as we have a path to
>> achieve
>> > > that
>> > > >> in
>> > > >> > the future, we don't need to address that in this KIP.
>> > > >>
>> > > >> If people are managing a large number of Kafka clusters, they will
>> > want
>> > > to
>> > > >> do some sort of A/B testing with IBP/metadata versions. So if you
>> have
>> > > 1000
>> > > >> Kafka clusters, you roll out the new IBP version to 10 of them and
>> see
>> > > how
>> > > >> it goes. If that goes well, you roll it out to more, etc.
>> > > >>
>> > > >> So, the automation needs to be at the cluster management layer, not
>> at
>> > > the
>> > > >> Kafka layer. Each Kafka cluster doesn't know how well things went in
>> > the
>> > > >> other 999 clusters. Automatically upgrading is a bad thing for the
>> > same
>> > > >> reason Kafka automatically upgrading its own software version would
>> > be a
>> > > >> bad thing -- it could lead to a disruption to a sensitive cluster at
>> > the
>> > > >> wrong time.
>> > > >>
>> > > >> For people who are just managing one or two Kafka clusters,
>> > > automatically
>> > > >> upgrading feature versions isn't a big burden and can be done
>> > manually.
>> > > >> This is all consistent with how IBP works today.
>> > > >>
>> > > >> Also, we already have a command-line option to the feature tool
>> which
>> > > >> upgrades all features to the latest available, if that is what the
>> > > >> administrator desires. Perhaps we could add documentation saying
>> that
>> > > this
>> > > >> should be done as the last step of the upgrade.
>> > > >>
>> > > >> best,
>> > > >> Colin
>> > > >>
>> > > >> >
>> > > >> > 21. "./kafka-features.sh delete": Deleting a feature seems a bit
>> > weird
>> > > >> > since the logic is always there. Would it be better to use
>> disable?
>> > > >> >
>> > > >> > Jun
>> > > >> >
>> > > >> > On Fri, Nov 5, 2021 at 8:11 AM David Arthur
>> > > >> > <david.art...@confluent.io.invalid> wrote:
>> > > >> >
>> > > >> >> Colin and Jun, thanks for the additional comments!
>> > > >> >>
>> > > >> >> Colin:
>> > > >> >>
>> > > >> >> > We've been talking about having an automated RPC compatibility
>> > > checker
>> > > >> >>
>> > > >> >> Do we have a way to mark fields in schemas as deprecated? It can
>> > > stay in
>> > > >> >> the RPC, it just complicates the logic a bit.
>> > > >> >>
>> > > >> >> > It would be nice if the active controller could validate that a
>> > > >> majority
>> > > >> >> of the quorum could use the proposed metadata.version. The active
>> > > >> >> controller should have this information, right? If we don't have
>> > > recent
>> > > >> >> information from a quorum of voters, we wouldn't be active.
>> > > >> >>
>> > > >> >> I believe we should have this information from the
>> > > ApiVersionsResponse.
>> > > >> It
>> > > >> >> would be good to do this validation to avoid a situation where a
>> > > >> >> quorum leader can't be elected due to unprocessable records.
>> > > >> >>
>> > > >> >> > Do we need delete as a command separate from downgrade?
>> > > >> >>
>> > > >> >> I think from an operator's perspective, it is nice to distinguish
>> > > >> between
>> > > >> >> changing a feature flag and unsetting it. It might be surprising
>> to
>> > > an
>> > > >> >> operator to see the flag's version set to nothing when they
>> > requested
>> > > >> the
>> > > >> >> downgrade to version 0 (or less).
>> > > >> >>
>> > > >> >> > it seems like we should spell out that metadata.version begins
>> at
>> > > 1 in
>> > > >> >> KRaft clusters
>> > > >> >>
>> > > >> >> I added this text:
>> > > >> >>
>> > > >> >> Introduce an IBP version to indicate the lowest software version
>> > that
>> > > >> >> > supports *metadata.version*. Below this IBP, the
>> > > *metadata.version* is
>> > > >> >> > undefined and will not be examined. At or above this IBP, the
>> > > >> >> > *metadata.version* must be *0* for ZooKeeper clusters and will
>> be
>> > > >> >> > initialized as *1* for KRaft clusters.
>> > > >> >>
>> > > >> >>
>> > > >> >> > We probably also want an RPC implemented by both brokers and
>> > > >> controllers
>> > > >> >> that will reveal the min and max supported versions for each
>> > feature
>> > > >> level
>> > > >> >> supported by the server
>> > > >> >>
>> > > >> >> This is available in ApiVersionsResponse (we include the server's
>> > > >> supported
>> > > >> >> features as well as the cluster's finalized features)
>> > > >> >>
>> > > >> >> --------
>> > > >> >>
>> > > >> >> Jun:
>> > > >> >>
>> > > >> >> 12. I've updated the KIP with AdminClient changes
>> > > >> >>
>> > > >> >> 14. You're right, it looks like I missed a few sections regarding
>> > > >> snapshot
>> > > >> >> generation. I've corrected it
>> > > >> >>
>> > > >> >> 16. This feels more like an enhancement to KIP-584. I agree it
>> > could
>> > > be
>> > > >> >> useful, but perhaps we could address it separately from KRaft
>> > > upgrades?
>> > > >> >>
>> > > >> >> 20. Indeed snapshots are not strictly necessary during an
>> upgrade,
>> > > I've
>> > > >> >> reworded this
>> > > >> >>
>> > > >> >>
>> > > >> >> Thanks!
>> > > >> >> David
>> > > >> >>
>> > > >> >>
>> > > >> >> On Thu, Nov 4, 2021 at 6:51 PM Jun Rao <j...@confluent.io.invalid
>> >
>> > > >> wrote:
>> > > >> >>
>> > > >> >> > Hi, David, Jose and Colin,
>> > > >> >> >
>> > > >> >> > Thanks for the reply. A few more comments.
>> > > >> >> >
>> > > >> >> > 12. It seems that we haven't updated the AdminClient
>> accordingly?
>> > > >> >> >
>> > > >> >> > 14. "Metadata snapshot is generated and sent to the other
>> > inactive
>> > > >> >> > controllers and to brokers". I thought we wanted each broker to
>> > > >> generate
>> > > >> >> > its own snapshot independently? If only the controller
>> generates
>> > > the
>> > > >> >> > snapshot, how do we force other brokers to pick it up?
>> > > >> >> >
>> > > >> >> > 16. If a feature version is new, one may not want to enable it
>> > > >> >> immediately
>> > > >> >> > after the cluster is upgraded. However, if a feature version
>> has
>> > > been
>> > > >> >> > stable, requiring every user to run a command to upgrade to
>> that
>> > > >> version
>> > > >> >> > seems inconvenient. One way to improve this is for each feature
>> > to
>> > > >> define
>> > > >> >> > one version as the default. Then, when we upgrade a cluster, we
>> > > will
>> > > >> >> > automatically upgrade the feature to the default version. An
>> > admin
>> > > >> could
>> > > >> >> > use the tool to upgrade to a version higher than the default.
>> > > >> >> >
>> > > >> >> > 20. "The quorum controller can assist with this process by
>> > > generating
>> > > >> a
>> > > >> >> > metadata snapshot after a metadata.version increase has been
>> > > >> committed to
>> > > >> >> > the metadata log. This snapshot will be a convenient way to let
>> > > broker
>> > > >> >> and
>> > > >> >> > controller components rebuild their entire in-memory state
>> > > following
>> > > >> an
>> > > >> >> > upgrade." The new version of the software could read both the
>> new
>> > > and
>> > > >> the
>> > > >> >> > old version. Is generating a new snapshot during upgrade
>> needed?
>> > > >> >> >
>> > > >> >> > Jun
>> > > >> >> >
>> > > >> >> >
>> > > >> >> > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe <
>> cmcc...@apache.org>
>> > > >> wrote:
>> > > >> >> >
>> > > >> >> > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
>> > > >> >> > > > Hi, David,
>> > > >> >> > > >
>> > > >> >> > > > One more comment.
>> > > >> >> > > >
>> > > >> >> > > > 16. The main reason why KIP-584 requires finalizing a
>> feature
>> > > >> >> manually
>> > > >> >> > is
>> > > >> >> > > > that in the ZK world, the controller doesn't know all
>> brokers
>> > > in a
>> > > >> >> > > cluster.
>> > > >> >> > > > A broker temporarily down is not registered in ZK. In the
>> > KRaft
>> > > >> >> world,
>> > > >> >> > > the
>> > > >> >> > > > controller keeps track of all brokers, including those that
>> > are
>> > > >> >> > > temporarily
>> > > >> >> > > > down. This makes it possible for the controller to
>> > > automatically
>> > > >> >> > > finalize a
>> > > >> >> > > > feature---it's safe to do so when all brokers support that
>> > > >> feature.
>> > > >> >> > This
>> > > >> >> > > > will make the upgrade process much simpler since no manual
>> > > >> command is
>> > > >> >> > > > required to turn on a new feature. Have we considered this?
>> > > >> >> > > >
>> > > >> >> > > > Thanks,
>> > > >> >> > > >
>> > > >> >> > > > Jun
>> > > >> >> > >
>> > > >> >> > > Hi Jun,
>> > > >> >> > >
>> > > >> >> > > I guess David commented on this point already, but I'll
>> comment
>> > > as
>> > > >> >> well.
>> > > >> >> > I
>> > > >> >> > > always had the perception that users viewed rolls as
>> > potentially
>> > > >> risky
>> > > >> >> > and
>> > > >> >> > > were looking for ways to reduce the risk. Not enabling
>> features
>> > > >> right
>> > > >> >> > away
>> > > >> >> > > after installing new software seems like one way to do that.
>> If
>> > > we
>> > > >> had
>> > > >> >> a
>> > > >> >> > > feature to automatically upgrade during a roll, I'm not sure
>> > > that I
>> > > >> >> would
>> > > >> >> > > recommend that people use it, because if something fails, it
>> > > makes
>> > > >> it
>> > > >> >> > > harder to tell if the new feature is at fault, or something
>> > else
>> > > in
>> > > >> the
>> > > >> >> > new
>> > > >> >> > > software.
>> > > >> >> > >
>> > > >> >> > > We already tell users to do a "double roll" when going to a
>> new
>> > > IBP.
>> > > >> >> > (Just
>> > > >> >> > > to give background to people who haven't heard that phrase,
>> the
>> > > >> first
>> > > >> >> > roll
>> > > >> >> > > installs the new software, and the second roll updates the
>> > IBP).
>> > > So
>> > > >> >> this
>> > > >> >> > > KIP-778 mechanism is basically very similar to that, except
>> the
>> > > >> second
>> > > >> >> > > thing isn't a roll, but just an upgrade command. So I think
>> > this
>> > > is
>> > > >> >> > > consistent with what we currently do.
>> > > >> >> > >
>> > > >> >> > > Also, just like David said, we can always add auto-upgrade
>> > later
>> > > if
>> > > >> >> there
>> > > >> >> > > is demand...
>> > > >> >> > >
>> > > >> >> > > best,
>> > > >> >> > > Colin
>> > > >> >> > >
>> > > >> >> > >
>> > > >> >> > > >
>> > > >> >> > > > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao <j...@confluent.io>
>> > > wrote:
>> > > >> >> > > >
>> > > >> >> > > >> Hi, David,
>> > > >> >> > > >>
>> > > >> >> > > >> Thanks for the KIP. A few comments below.
>> > > >> >> > > >>
>> > > >> >> > > >> 10. It would be useful to describe how the controller node
>> > > >> >> determines
>> > > >> >> > > the
>> > > >> >> > > >> RPC version used to communicate to other controller nodes.
>> > > There
>> > > >> >> seems
>> > > >> >> > > to
>> > > >> >> > > >> be a bootstrap problem. A controller node can't read the
>> log
>> > > and
>> > > >> >> > > >> therefore the feature level until a quorum leader is
>> > elected.
>> > > But
>> > > >> >> > leader
>> > > >> >> > > >> election requires an RPC.
>> > > >> >> > > >>
>> > > >> >> > > >> 11. For downgrades, it would be useful to describe how to
>> > > >> determine
>> > > >> >> > the
>> > > >> >> > > >> downgrade process (generating new snapshot, propagating
>> the
>> > > >> >> snapshot,
>> > > >> >> > > etc)
>> > > >> >> > > >> has completed. We could block the UpdateFeature request
>> > until
>> > > the
>> > > >> >> > > process
>> > > >> >> > > >> is completed. However, since the process could take time,
>> > the
>> > > >> >> request
>> > > >> >> > > could
>> > > >> >> > > >> time out. Another way is through DescribeFeature and the
>> > > server
>> > > >> only
>> > > >> >> > > >> reports downgraded versions after the process is
>> completed.
>> > > >> >> > > >>
>> > > >> >> > > >> 12. Since we are changing UpdateFeaturesRequest, do we
>> need
>> > to
>> > > >> >> change
>> > > >> >> > > the
>> > > >> >> > > >> AdminClient api for updateFeatures too?
>> > > >> >> > > >>
>> > > >> >> > > >> 13. For the paragraph starting with "In the absence of an
>> > > >> operator
>> > > >> >> > > >> defined value for metadata.version", in KIP-584, we
>> > described
>> > > >> how to
>> > > >> >> > > >> finalize features with New cluster bootstrap. In that
>> case,
>> > > it's
>> > > >> >> > > >> inconvenient for the users to have to run an admin tool to
>> > > >> finalize
>> > > >> >> > the
>> > > >> >> > > >> version for each feature. Instead, the system detects that
>> > the
>> > > >> >> > /features
>> > > >> >> > > >> path is missing in ZK and thus automatically finalizes
>> every
>> > > >> feature
>> > > >> >> > > with
>> > > >> >> > > >> the latest supported version. Could we do something
>> similar
>> > in
>> > > >> the
>> > > >> >> > KRaft
>> > > >> >> > > >> mode?
>> > > >> >> > > >>
>> > > >> >> > > >> 14. After the quorum leader generates a new snapshot, how
>> do
>> > > we
>> > > >> >> force
>> > > >> >> > > >> other nodes to pick up the new snapshot?
>> > > >> >> > > >>
>> > > >> >> > > >> 15. I agree with Jose that it will be useful to describe
>> > when
>> > > >> >> > > generating a
>> > > >> >> > > >> new snapshot is needed. To me, it seems the new snapshot
>> is
>> > > only
>> > > >> >> > needed
>> > > >> >> > > >> when incompatible changes are made.
>> > > >> >> > > >>
>> > > >> >> > > >> 7. Jose, which control records were you referring to?
>> > > >> >> > > >>
>> > > >> >> > > >> Thanks,
>> > > >> >> > > >>
>> > > >> >> > > >> Jun
>> > > >> >> > > >>
>> > > >> >> > > >>
>> > > >> >> > > >> On Tue, Oct 5, 2021 at 8:53 AM David Arthur <
>> > > >> davidart...@apache.org
>> > > >> >> >
>> > > >> >> > > >> wrote:
>> > > >> >> > > >>
>> > > >> >> > > >>> Jose, thanks for the thorough review and comments!
>> > > >> >> > > >>>
>> > > >> >> > > >>> I am out of the office until next week, so I probably
>> won't
>> > > be
>> > > >> able
>> > > >> >> > to
>> > > >> >> > > >>> update the KIP until then. Here are some replies to your
>> > > >> questions:
>> > > >> >> > > >>>
>> > > >> >> > > >>> 1. Generate snapshot on upgrade
>> > > >> >> > > >>> > > Metadata snapshot is generated and sent to the other
>> > > nodes
>> > > >> >> > > >>> > Why does the Active Controller need to generate a new
>> > > snapshot
>> > > >> >> and
>> > > >> >> > > >>> > force a snapshot fetch from the replicas (inactive
>> > > controller
>> > > >> and
>> > > >> >> > > >>> > brokers) on an upgrade? Isn't writing the
>> > > FeatureLevelRecord
>> > > >> good
>> > > >> >> > > >>> > enough to communicate the upgrade to the replicas?
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> You're right, we don't necessarily need to _transmit_ a
>> > > >> snapshot,
>> > > >> >> > since
>> > > >> >> > > >>> each node can generate its own equivalent snapshot
>> > > >> >> > > >>>
>> > > >> >> > > >>> 2. Generate snapshot on downgrade
>> > > >> >> > > >>> > > Metadata snapshot is generated and sent to the other
>> > > >> inactive
>> > > >> >> > > >>> > controllers and to brokers (this snapshot may be
>> lossy!)
>> > > >> >> > > >>> > Why do we need to send this downgraded snapshot to the
>> > > >> brokers?
>> > > >> >> The
>> > > >> >> > > >>> > replicas have seen the FeatureLevelRecord and noticed
>> the
>> > > >> >> > downgrade.
>> > > >> >> > > >>> > Can we have the replicas each independently generate a
>> > > >> downgraded
>> > > >> >> > > >>> > snapshot at the offset for the downgraded
>> > > FeatureLevelRecord?
>> > > >> I
>> > > >> >> > > assume
>> > > >> >> > > >>> > that the active controller will guarantee that all
>> > records
>> > > >> after
>> > > >> >> > the
>> > > >> >> > > >>> > FeatureLevelRecord use the downgraded version. If so, it
>> > > would
>> > > >> be
>> > > >> >> > good
>> > > >> >> > > >>> > to mention that explicitly.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> Similar to above, yes a broker that detects a downgrade
>> via
>> > > >> >> > > >>> FeatureLevelRecord could generate its own downgrade
>> > snapshot
>> > > and
>> > > >> >> > reload
>> > > >> >> > > >>> its
>> > > >> >> > > >>> state from that. This does get a little fuzzy when we
>> > > consider
>> > > >> >> cases
>> > > >> >> > > where
>> > > >> >> > > >>> brokers are on different software versions and could be
>> > > >> generating
>> > > >> >> a
>> > > >> >> > > >>> downgrade snapshot for version X, but using different
>> > > versions
>> > > >> of
>> > > >> >> the
>> > > >> >> > > >>> code.
>> > > >> >> > > >>> It might be safer to let the controller generate the
>> > > snapshot so
>> > > >> >> each
>> > > >> >> > > >>> broker (regardless of software version) gets the same
>> > > records.
>> > > >> >> > However,
>> > > >> >> > > >>> for
>> > > >> >> > > >>> upgrades (or downgrades) we expect the whole cluster to
>> be
>> > > >> running
>> > > >> >> > the
>> > > >> >> > > >>> same
>> > > >> >> > > >>> software version before triggering the metadata.version
>> > > change,
>> > > >> so
>> > > >> >> > > perhaps
>> > > >> >> > > >>> this isn't a likely scenario. Thoughts?
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> 3. Max metadata version
>> > > >> >> > > >>> > >For the first release that supports metadata.version,
>> we
>> > > can
>> > > >> >> > simply
>> > > >> >> > > >>> > initialize metadata.version with the current (and only)
>> > > >> version.
>> > > >> >> > For
>> > > >> >> > > >>> future
>> > > >> >> > > >>> > releases, we will need a mechanism to bootstrap a
>> > > particular
>> > > >> >> > version.
>> > > >> >> > > >>> This
>> > > >> >> > > >>> > could be done using the meta.properties file or some
>> > > similar
>> > > >> >> > > mechanism.
>> > > >> >> > > >>> The
>> > > >> >> > > >>> > reason we need to allow for a specific initial version
>> > is
>> > > to
>> > > >> >> > support
>> > > >> >> > > >>> the
>> > > >> >> > > >>> > use case of starting a Kafka cluster at version X with
>> an
>> > > >> older
>> > > >> >> > > >>> > metadata.version.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> I assume that the Active Controller will learn the
>> metadata
>> > > >> version
>> > > >> >> > of
>> > > >> >> > > >>> > the broker through the BrokerRegistrationRequest. How
>> > will
>> > > the
>> > > >> >> > Active
>> > > >> >> > > >>> > Controller learn about the max metadata version of the
>> > > >> inactive
>> > > >> >> > > >>> > controller nodes? We currently don't send a
>> registration
>> > > >> request
>> > > >> >> > from
>> > > >> >> > > >>> > the inactive controller to the active controller.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> This came up during the design, but I neglected to add it
>> > to
>> > > the
>> > > >> >> KIP.
>> > > >> >> > > We
>> > > >> >> > > >>> will need a mechanism for determining the supported
>> > features
>> > > of
>> > > >> >> each
>> > > >> >> > > >>> controller similar to how brokers use
>> > > BrokerRegistrationRequest.
>> > > >> >> > > Perhaps
>> > > >> >> > > >>> controllers could write a FeatureLevelRecord (or similar)
>> > to
>> > > the
>> > > >> >> > > metadata
>> > > >> >> > > >>> log indicating their supported version. WDYT?
>> > > >> >> > > >>>
>> > > >> >> > > >>> Why do you need to bootstrap a particular version? Isn't
>> > the
>> > > >> intent
>> > > >> >> > > >>> > that the broker will learn the active metadata version
>> by
>> > > >> reading
>> > > >> >> > the
>> > > >> >> > > >>> > metadata before unfencing?
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> This bootstrapping is needed for when a KRaft cluster is
>> > > first
>> > > >> >> > > started. If
>> > > >> >> > > >>> we don't have this mechanism, the cluster can't really do
>> > > >> anything
>> > > >> >> > > until
>> > > >> >> > > >>> the operator finalizes the metadata.version with the
>> tool.
>> > > The
>> > > >> >> > > >>> bootstrapping will be done by the controller and the
>> > brokers
>> > > >> will
>> > > >> >> see
>> > > >> >> > > this
>> > > >> >> > > >>> version as a record (like you say). I'll add some text to
>> > > >> clarify
>> > > >> >> > this.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> 4. Reject Registration - This is related to the bullet
>> > point
>> > > >> above.
>> > > >> >> > > >>> > What will be the behavior of the active controller if
>> the
>> > > >> broker
>> > > >> >> > > sends
>> > > >> >> > > >>> > a metadata version that is not compatible with the
>> > cluster
>> > > >> wide
>> > > >> >> > > >>> > metadata version?
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> If a broker starts up with a lower supported version
>> range
>> > > than
>> > > >> the
>> > > >> >> > > >>> current
>> > > >> >> > > >>> cluster metadata.version, it should log an error and
>> > > shutdown.
>> > > >> This
>> > > >> >> > is
>> > > >> >> > > in
>> > > >> >> > > >>> line with KIP-584.
>> > > >> >> > > >>>
>> > > >> >> > > >>> 5. Discover upgrade
>> > > >> >> > > >>> > > This snapshot will be a convenient way to let broker
>> > and
>> > > >> >> > controller
>> > > >> >> > > >>> > components rebuild their entire in-memory state
>> following
>> > > an
>> > > >> >> > upgrade.
>> > > >> >> > > >>> > Can we rely on the presence of the FeatureLevelRecord
>> for
>> > > the
>> > > >> >> > > metadata
>> > > >> >> > > >>> > version for this functionality? If so, it avoids having
>> > to
>> > > >> reload
>> > > >> >> > the
>> > > >> >> > > >>> > snapshot.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> For upgrades, yes probably since we won't need to
>> "rewrite"
>> > > any
>> > > >> >> > > records in
>> > > >> >> > > >>> this case. For downgrades, we will need to generate the
>> > > snapshot
>> > > >> >> and
>> > > >> >> > > >>> reload
>> > > >> >> > > >>> everything.
>> > > >> >> > > >>>
>> > > >> >> > > >>> 6. Metadata version specification
>> > > >> >> > > >>> > >  V4(version=4, isBackwardsCompatible=false,
>> > > description="New
>> > > >> >> > > metadata
>> > > >> >> > > >>> > record type Bar"),
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> Very cool. Do you have plans to generate Apache Kafka
>> HTML
>> > > >> >> > > >>> > documentation for this information? Would be helpful to
>> > > >> display
>> > > >> >> > this
>> > > >> >> > > >>> > information to the user using the kafka-features.sh and
>> > > >> feature
>> > > >> >> > RPC?
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> Hm good idea :) I'll add a brief section on
>> documentation.
>> > > This
>> > > >> >> would
>> > > >> >> > > >>> certainly be very useful
>> > > >> >> > > >>>
>> > > >> >> > > >>> 7.Downgrade records
>> > > >> >> > > >>> > I think we should explicitly mention that the downgrade
>> > > >> process
>> > > >> >> > will
>> > > >> >> > > >>> > downgrade both metadata records and controller records.
>> > In
>> > > >> >> KIP-630
>> > > >> >> > we
>> > > >> >> > > >>> > introduced two control records for snapshots.
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> Yes, good call. Let me re-read that KIP and include some
>> > > >> details.
>> > > >> >> > > >>>
>> > > >> >> > > >>> Thanks again for the comments!
>> > > >> >> > > >>> -David
>> > > >> >> > > >>>
>> > > >> >> > > >>>
>> > > >> >> > > >>> On Wed, Sep 29, 2021 at 5:09 PM José Armando García
>> Sancio
>> > > >> >> > > >>> <jsan...@confluent.io.invalid> wrote:
>> > > >> >> > > >>>
>> > > >> >> > > >>> > One more comment.
>> > > >> >> > > >>> >
>> > > >> >> > > >>> > 7.Downgrade records
>> > > >> >> > > >>> > I think we should explicitly mention that the downgrade
>> > > >> process
>> > > >> >> > will
>> > > >> >> > > >>> > downgrade both metadata records and controller records.
>> > In
>> > > >> >> KIP-630
>> > > >> >> > we
>> > > >> >> > > >>> > introduced two control records for snapshots.
>> > > >> >> > > >>> >
>> > > >> >> > > >>>
>> > > >> >> > > >>
>> > > >> >> > >
>> > > >> >> >
>> > > >> >>
>> > > >> >>
>> > > >> >> --
>> > > >> >> -David
>> > > >> >>
>> > > >>
>> > >
>> >
>>
>
>
> -- 
> -David
