Jason, 1/2: You've got it right. The intention is that metadata.version will gate both RPCs (like the IBP does) as well as metadata records. So, when a broker sees that metadata.version changed, it may start advertising new RPCs, and it will need to refresh the ApiVersions it has for other brokers. I can try to make this more explicit in the KIP.
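To make the gating behavior concrete, here is a rough sketch of the broker-side reaction. The names (PeerApiVersionsCache, MetadataVersionListener) are invented for illustration -- this is a sketch, not the actual implementation:

    // Sketch: a broker reacting to a metadata.version change observed in the
    // metadata log. All names here are hypothetical.
    interface PeerApiVersionsCache {
        void invalidateAll();      // drop the cached ApiVersions for all peers
        void refreshFromPeers();   // re-send ApiVersionsRequest to the other nodes
    }

    class MetadataVersionListener {
        private final PeerApiVersionsCache peerVersions;
        private volatile short finalizedLevel;

        MetadataVersionListener(PeerApiVersionsCache peerVersions, short initialLevel) {
            this.peerVersions = peerVersions;
            this.finalizedLevel = initialLevel;
        }

        // Called when a FeatureLevelRecord for "metadata.version" is replayed.
        void onFeatureLevelRecord(String feature, short newLevel) {
            if (!"metadata.version".equals(feature) || newLevel == finalizedLevel)
                return;
            finalizedLevel = newLevel;
            // The new level may gate new RPCs and record versions, so the cached
            // view of what the other nodes advertise could be stale.
            peerVersions.invalidateAll();
            peerVersions.refreshFromPeers();
        }
    }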
3. Yes, that's the intention of the --dry-run flag. The current implementation in kafka-features.sh does a dry run on the client side, but this KIP pushes the validation down to the controller. This will allow us to have the context needed to do proper validation.

Re: version number complexity -- yes, this has come up in offline discussions. With just one feature flag to manage, it's not so bad, but explicitly managing even a few would be a burden. I like your suggestion of the "--release" flag. That could act as a sort of manifest of versions (i.e., the defaults from that release). We could also support something like "kafka-features upgrade --release latest" to bring everything to the highest supported version.
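To sketch what that manifest might look like (the feature names and levels below are made up, not proposed values):

    // Sketch of a static release -> feature-version manifest behind a
    // hypothetical "--release" flag. All levels here are invented.
    import java.util.Map;

    final class ReleaseManifest {
        private static final Map<String, Map<String, Short>> DEFAULTS = Map.of(
            "3.1", Map.of("metadata.version", (short) 3, "offsets.version", (short) 4),
            "3.2", Map.of("metadata.version", (short) 4, "offsets.version", (short) 4));

        // "kafka-features upgrade --release 3.1" would resolve to DEFAULTS.get("3.1")
        // and submit a single update covering every feature in the map.
        static Map<String, Short> defaultsFor(String release) {
            Map<String, Short> levels = DEFAULTS.get(release);
            if (levels == null)
                throw new IllegalArgumentException("Unknown release: " + release);
            return levels;
        }
    }

Under this scheme, "--release latest" would simply resolve to the entry for the newest known release.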
On Wed, Nov 17, 2021 at 9:36 PM Jason Gustafson <ja...@confluent.io.invalid> wrote:

> Hi Colin/David,
>
> > Like David said, basically it boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix. But using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion and we need to have it on a case-by-case basis.
>
> I think most users are not going to care to manage versions for a bunch of different features. The IBP today has many shortcomings, but at least it's tied to a version that users understand (i.e. the release version). How would users know after upgrading to Kafka 3.1, for example, that they need to upgrade the metadata.version to 3 and offsets.version to 4 (or whatever)? It's a lot of overhead trying to understand all of the potential features and what each upgrade actually means to them. I am wondering if we could give them something more convenient which is tied to the release version. For example, maybe we could use a command like `kafka-features upgrade --release 3.1`, which the broker would then translate to an upgrade to the latest versions of the individual features at the time of the 3.1 release. Basically it's just a static map from release version to feature versions to make the upgrade process more convenient.
>
> Thanks,
> Jason
>
> On Wed, Nov 17, 2021 at 6:20 PM Jason Gustafson <ja...@confluent.io> wrote:
>
> > A few additional questions:
> >
> > 1. Currently the IBP tells us what version of individual inter-broker RPCs will be used. I think the plan in this KIP is to use the ApiVersions request instead to find the highest compatible version (just like clients). Do I have that right?
> >
> > 2. The following wasn't very clear to me:
> >
> > > Brokers will be able to observe changes to metadata.version by observing the metadata log, and could then submit a new ApiVersionsRequest to the other Kafka nodes.
> >
> > Is the purpose of submitting new ApiVersions requests to update the features or the RPC versions? Does metadata.version also influence the versions that a broker advertises? It would help to have more detail about this.
> >
> > 3. I imagine users will want to know before performing an upgrade whether downgrading will be safe. Would the --dry-run flag tell them this?
> >
> > Thanks,
> > Jason
> >
> > On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> >> On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
> >> > Hi David,
> >> >
> >> > Forgive me if this ground has been covered already. Today, we have a few other things that we have latched onto the IBP, such as upgrades to the format of records in __consumer_offsets. I've been assuming that metadata.version is not covering this. Is that right or is there some other plan to take care of cases like this?
> >>
> >> I think metadata.version could cover changes to things like __consumer_offsets, if people want it to. Or to put it another way, that is out of scope for this KIP.
> >>
> >> Like David said, basically it boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix. But using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion and we need to have it on a case-by-case basis.
> >>
> >> I think it's worth calling out that having a 1:1 mapping between IBP versions and metadata.versions will result in some metadata.versions that "don't do anything" (aka they do the same thing as the previous metadata.version). For example, if we change StopReplicaRequest again, that will not affect KRaft mode, but probably would require an IBP bump and hence a metadata.version bump. I think that's OK. It's not that different from updating your IBP and getting support for ZStandard, when your deployment doesn't use ZStandard compression.
> >>
> >> best,
> >> Colin
> >>
> >> > Thanks,
> >> > Jason
> >> >
> >> > On Wed, Nov 17, 2021 at 10:17 AM Jun Rao <j...@confluent.io.invalid> wrote:
> >> >
> >> >> Hi, Colin,
> >> >>
> >> >> Thanks for the reply.
> >> >>
> >> >> For case b, I am not sure that I understand your suggestion. Does "each subsequent level for metadata.version corresponds to an IBP version" mean that we need to keep IBP forever? Could you describe the upgrade process in this case?
> >> >>
> >> >> Jun
> >> >>
> >> >> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe <cmcc...@apache.org> wrote:
> >> >>
> >> >> > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> >> >> > > Hi, David, Colin,
> >> >> > >
> >> >> > > Thanks for the reply.
> >> >> > >
> >> >> > > 16. Discussed with David offline a bit. We have 3 cases.
> >> >> > > a. We upgrade from an old version where the metadata.version has already been finalized. In this case it makes sense to stay with that feature version after the upgrade.
> >> >> >
> >> >> > +1
> >> >> >
> >> >> > > b. We upgrade from an old version where no metadata.version has been finalized. In this case, it makes sense to leave metadata.version disabled since we don't know if all brokers have been upgraded.
> >> >> >
> >> >> > This is the scenario I was hoping to avoid by saying that ALL KRaft clusters have metadata.version of at least 1, and each subsequent level for metadata.version corresponds to an IBP version.
> >> >> > The existing KRaft clusters in 3.0 and earlier are preview (not for production), so I think this change is OK for 3.x (given that it affects only KRaft). Then the IBP is irrelevant for KRaft clusters (the config is ignored, possibly with a WARN or ERROR message generated if it is set).
> >> >> >
> >> >> > > c. We are starting from a brand new cluster and of course no metadata.version has been finalized. In this case, the KIP says it will pick the metadata.version in meta.properties. In the common case, people probably won't set the metadata.version in the meta.properties file explicitly. So, it will be useful to put a default (stable) version there when the meta.properties file is generated.
> >> >> >
> >> >> > Hmm. I was assuming that clusters where the admin didn't specify any metadata.version during formatting would get the latest metadata.version. Partly, because this is what we do for the IBP today. It would be good to clarify this...
> >> >> >
> >> >> > > Also, it would be useful to clarify that if a FeatureLevelRecord exists for metadata.version, the metadata.version in meta.properties will be ignored.
> >> >> >
> >> >> > Yeah, I agree.
> >> >> >
> >> >> > best,
> >> >> > Colin
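(A note for readers of the archive: the resolution order being settled on above can be summarized with a rough sketch. The class and method names are invented for illustration, not the actual implementation.)

    import java.util.Optional;

    final class MetadataVersionResolver {
        // Resolution order from the discussion above: a FeatureLevelRecord
        // committed to the metadata log always wins; otherwise use the version
        // written into meta.properties at format time; otherwise default to the
        // latest supported version, mirroring how the IBP defaults today.
        static short resolveInitial(Optional<Short> fromFeatureLevelRecord,
                                    Optional<Short> fromMetaProperties,
                                    short latestSupported) {
            return fromFeatureLevelRecord.orElse(
                   fromMetaProperties.orElse(latestSupported));
        }
    }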
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Jun
> >> >> > >
> >> >> > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe <cmcc...@apache.org> wrote:
> >> >> > >
> >> >> > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> >> >> > >> > Hi, David,
> >> >> > >> >
> >> >> > >> > Thanks for the reply.
> >> >> > >> >
> >> >> > >> > 16. My first concern is that the KIP picks up meta.version inconsistently during the deployment. If a new cluster is started, we pick up the highest version. If we upgrade, we leave the feature version unchanged.
> >> >> > >>
> >> >> > >> Hi Jun,
> >> >> > >>
> >> >> > >> Thanks again for taking a look.
> >> >> > >>
> >> >> > >> The proposed behavior in KIP-778 is consistent with how it works today. Upgrading the software is distinct from upgrading the IBP.
> >> >> > >>
> >> >> > >> I think it is important to keep these two operations ("upgrading IBP/metadata version" and "upgrading software version") separate. If they are coupled, it will create a situation where software upgrades are difficult and dangerous.
> >> >> > >>
> >> >> > >> Consider a situation where you find some bug in your current software, and you want to upgrade to new software that fixes the bug. If upgrades and IBP bumps are coupled, you can't do this without also bumping the IBP, which is usually considered a high-risk change. That means that either you have to make a special build that includes only the fix (time-consuming and error-prone), live with the bug for longer, or be very conservative about ever introducing new IBP/metadata versions. None of those are really good choices.
> >> >> > >>
> >> >> > >> > Intuitively, it seems that independent of how a cluster is deployed, we should always pick the same feature version.
> >> >> > >>
> >> >> > >> I think it makes sense to draw a distinction between upgrading an existing cluster and deploying a new one. What most people want out of upgrades is that things should keep working, but with bug fixes. If we change that, it just makes people more reluctant to upgrade (which is always a problem...)
> >> >> > >>
> >> >> > >> > I think we need to think this through in this KIP. My second concern is that as a particular version matures, it's inconvenient for a user to manually upgrade every feature version. As long as we have a path to achieve that in the future, we don't need to address that in this KIP.
> >> >> > >>
> >> >> > >> If people are managing a large number of Kafka clusters, they will want to do some sort of A/B testing with IBP/metadata versions. So if you have 1000 Kafka clusters, you roll out the new IBP version to 10 of them and see how it goes. If that goes well, you roll it out to more, etc.
> >> >> > >>
> >> >> > >> So, the automation needs to be at the cluster management layer, not at the Kafka layer. Each Kafka cluster doesn't know how well things went in the other 999 clusters. Automatically upgrading is a bad thing for the same reason Kafka automatically upgrading its own software version would be a bad thing -- it could lead to a disruption to a sensitive cluster at the wrong time.
> >> >> > >>
> >> >> > >> For people who are just managing one or two Kafka clusters, upgrading feature versions isn't a big burden and can be done manually. This is all consistent with how the IBP works today.
> >> >> > >>
> >> >> > >> Also, we already have a command-line option to the feature tool which upgrades all features to the latest available, if that is what the administrator desires. Perhaps we could add documentation saying that this should be done as the last step of the upgrade.
> >> >> > >>
> >> >> > >> best,
> >> >> > >> Colin
> >> >> > >>
> >> >> > >> > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird since the logic is always there. Would it be better to use disable?
> >> >> > >> >
> >> >> > >> > Jun
> >> >> > >> >
> >> >> > >> > On Fri, Nov 5, 2021 at 8:11 AM David Arthur <david.art...@confluent.io.invalid> wrote:
> >> >> > >> >
> >> >> > >> >> Colin and Jun, thanks for the additional comments!
> >> >> > >> >>
> >> >> > >> >> Colin:
> >> >> > >> >>
> >> >> > >> >> > We've been talking about having an automated RPC compatibility checker
> >> >> > >> >>
> >> >> > >> >> Do we have a way to mark fields in schemas as deprecated? It can stay in the RPC, it just complicates the logic a bit.
> >> >> > >> >>
> >> >> > >> >> > It would be nice if the active controller could validate that a majority of the quorum could use the proposed metadata.version. The active controller should have this information, right? If we don't have recent information from a quorum of voters, we wouldn't be active.
> >> >> > >> >>
> >> >> > >> >> I believe we should have this information from the ApiVersionsResponse. It would be good to do this validation to avoid a situation where a quorum leader can't be elected due to unprocessable records.
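(To make the validation above concrete -- a rough sketch, assuming the active controller has each voter's supported range from its ApiVersionsResponse; all names are invented:)

    import java.util.List;

    record SupportedRange(short min, short max) {}

    final class QuorumVersionCheck {
        // Before finalizing a proposed metadata.version, check that a majority
        // of the quorum voters advertise support for it.
        static boolean majoritySupports(short proposed, List<SupportedRange> voters) {
            long supporting = voters.stream()
                .filter(r -> r.min() <= proposed && proposed <= r.max())
                .count();
            return supporting > voters.size() / 2;
        }
    }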
"Metadata snapshot is generated and sent to the other > >> >> inactive > >> >> > >> >> > controllers and to brokers". I thought we wanted each > broker > >> to > >> >> > >> generate > >> >> > >> >> > its own snapshot independently? If only the controller > >> generates > >> >> > the > >> >> > >> >> > snapshot, how do we force other brokers to pick it up? > >> >> > >> >> > > >> >> > >> >> > 16. If a feature version is new, one may not want to enable > >> it > >> >> > >> >> immediately > >> >> > >> >> > after the cluster is upgraded. However, if a feature > version > >> has > >> >> > been > >> >> > >> >> > stable, requiring every user to run a command to upgrade to > >> that > >> >> > >> version > >> >> > >> >> > seems inconvenient. One way to improve this is for each > >> feature > >> >> to > >> >> > >> define > >> >> > >> >> > one version as the default. Then, when we upgrade a > cluster, > >> we > >> >> > will > >> >> > >> >> > automatically upgrade the feature to the default version. > An > >> >> admin > >> >> > >> could > >> >> > >> >> > use the tool to upgrade to a version higher than the > default. > >> >> > >> >> > > >> >> > >> >> > 20. "The quorum controller can assist with this process by > >> >> > generating > >> >> > >> a > >> >> > >> >> > metadata snapshot after a metadata.version increase has > been > >> >> > >> committed to > >> >> > >> >> > the metadata log. This snapshot will be a convenient way to > >> let > >> >> > broker > >> >> > >> >> and > >> >> > >> >> > controller components rebuild their entire in-memory state > >> >> > following > >> >> > >> an > >> >> > >> >> > upgrade." The new version of the software could read both > >> the new > >> >> > and > >> >> > >> the > >> >> > >> >> > old version. Is generating a new snapshot during upgrade > >> needed? > >> >> > >> >> > > >> >> > >> >> > Jun > >> >> > >> >> > > >> >> > >> >> > > >> >> > >> >> > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe < > >> cmcc...@apache.org> > >> >> > >> wrote: > >> >> > >> >> > > >> >> > >> >> > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: > >> >> > >> >> > > > Hi, David, > >> >> > >> >> > > > > >> >> > >> >> > > > One more comment. > >> >> > >> >> > > > > >> >> > >> >> > > > 16. The main reason why KIP-584 requires finalizing a > >> feature > >> >> > >> >> manually > >> >> > >> >> > is > >> >> > >> >> > > > that in the ZK world, the controller doesn't know all > >> brokers > >> >> > in a > >> >> > >> >> > > cluster. > >> >> > >> >> > > > A broker temporarily down is not registered in ZK. in > the > >> >> KRaft > >> >> > >> >> world, > >> >> > >> >> > > the > >> >> > >> >> > > > controller keeps track of all brokers, including those > >> that > >> >> are > >> >> > >> >> > > temporarily > >> >> > >> >> > > > down. This makes it possible for the controller to > >> >> > automatically > >> >> > >> >> > > finalize a > >> >> > >> >> > > > feature---it's safe to do so when all brokers support > >> that > >> >> > >> feature. > >> >> > >> >> > This > >> >> > >> >> > > > will make the upgrade process much simpler since no > >> manual > >> >> > >> command is > >> >> > >> >> > > > required to turn on a new feature. Have we considered > >> this? > >> >> > >> >> > > > > >> >> > >> >> > > > Thanks, > >> >> > >> >> > > > > >> >> > >> >> > > > Jun > >> >> > >> >> > > > >> >> > >> >> > > Hi Jun, > >> >> > >> >> > > > >> >> > >> >> > > I guess David commented on this point already, but I'll > >> comment > >> >> > as > >> >> > >> >> well. 
> >> >> > >> >> >
> >> >> > >> >> > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe <cmcc...@apache.org> wrote:
> >> >> > >> >> >
> >> >> > >> >> > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> >> >> > >> >> > > > Hi, David,
> >> >> > >> >> > > >
> >> >> > >> >> > > > One more comment.
> >> >> > >> >> > > >
> >> >> > >> >> > > > 16. The main reason why KIP-584 requires finalizing a feature manually is that in the ZK world, the controller doesn't know all brokers in a cluster. A broker temporarily down is not registered in ZK. In the KRaft world, the controller keeps track of all brokers, including those that are temporarily down. This makes it possible for the controller to automatically finalize a feature---it's safe to do so when all brokers support that feature. This will make the upgrade process much simpler since no manual command is required to turn on a new feature. Have we considered this?
> >> >> > >> >> > > >
> >> >> > >> >> > > > Thanks,
> >> >> > >> >> > > > Jun
> >> >> > >> >> > >
> >> >> > >> >> > > Hi Jun,
> >> >> > >> >> > >
> >> >> > >> >> > > I guess David commented on this point already, but I'll comment as well. I always had the perception that users viewed rolls as potentially risky and were looking for ways to reduce the risk. Not enabling features right away after installing new software seems like one way to do that. If we had a feature to automatically upgrade during a roll, I'm not sure that I would recommend that people use it, because if something fails, it makes it harder to tell if the new feature is at fault, or something else in the new software.
> >> >> > >> >> > >
> >> >> > >> >> > > We already tell users to do a "double roll" when going to a new IBP. (Just to give background to people who haven't heard that phrase, the first roll installs the new software, and the second roll updates the IBP.) So this KIP-778 mechanism is basically very similar to that, except the second thing isn't a roll, but just an upgrade command. So I think this is consistent with what we currently do.
> >> >> > >> >> > >
> >> >> > >> >> > > Also, just like David said, we can always add auto-upgrade later if there is demand...
> >> >> > >> >> > >
> >> >> > >> >> > > best,
> >> >> > >> >> > > Colin
> >> >> > >> >> > >
> >> >> > >> >> > > > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao <j...@confluent.io> wrote:
> >> >> > >> >> > > >
> >> >> > >> >> > > >> Hi, David,
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> Thanks for the KIP. A few comments below.
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 10. It would be useful to describe how the controller node determines the RPC version used to communicate to other controller nodes. There seems to be a bootstrap problem. A controller node can't read the log and therefore the feature level until a quorum leader is elected. But leader election requires an RPC.
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 11. For downgrades, it would be useful to describe how to determine that the downgrade process (generating a new snapshot, propagating the snapshot, etc.) has completed. We could block the UpdateFeature request until the process is completed. However, since the process could take time, the request could time out. Another way is through DescribeFeature, where the server only reports downgraded versions after the process is completed.
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 12. Since we are changing UpdateFeaturesRequest, do we need to change the AdminClient API for updateFeatures too?
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 13. For the paragraph starting with "In the absence of an operator defined value for metadata.version": in KIP-584, we described how to finalize features with new cluster bootstrap. In that case, it's inconvenient for the users to have to run an admin tool to finalize the version for each feature. Instead, the system detects that the /features path is missing in ZK and thus automatically finalizes every feature with the latest supported version. Could we do something similar in the KRaft mode?
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 14. After the quorum leader generates a new snapshot, how do we force other nodes to pick up the new snapshot?
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 15. I agree with Jose that it will be useful to describe when generating a new snapshot is needed. To me, it seems the new snapshot is only needed when incompatible changes are made.
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> 7. Jose, what control records were you referring to?
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> Thanks,
> >> >> > >> >> > > >> Jun
> >> >> > >> >> > > >>
> >> >> > >> >> > > >> On Tue, Oct 5, 2021 at 8:53 AM David Arthur <davidart...@apache.org> wrote:
> >> >> > >> >> > > >>
> >> >> > >> >> > > >>> Jose, thanks for the thorough review and comments!
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> I am out of the office until next week, so I probably won't be able to update the KIP until then. Here are some replies to your questions:
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 1. Generate snapshot on upgrade
> >> >> > >> >> > > >>> > > Metadata snapshot is generated and sent to the other nodes
> >> >> > >> >> > > >>> > Why does the Active Controller need to generate a new snapshot and force a snapshot fetch from the replicas (inactive controllers and brokers) on an upgrade? Isn't writing the FeatureLevelRecord good enough to communicate the upgrade to the replicas?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> You're right, we don't necessarily need to _transmit_ a snapshot, since each node can generate its own equivalent snapshot.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 2. Generate snapshot on downgrade
> >> >> > >> >> > > >>> > > Metadata snapshot is generated and sent to the other inactive controllers and to brokers (this snapshot may be lossy!)
> >> >> > >> >> > > >>> > Why do we need to send this downgraded snapshot to the brokers? The replicas have seen the FeatureLevelRecord and noticed the downgrade. Can we have the replicas each independently generate a downgraded snapshot at the offset for the downgraded FeatureLevelRecord? I assume that the active controller will guarantee that all records after the FeatureLevelRecord use the downgraded version. If so, it would be good to mention that explicitly.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> Similar to above, yes, a broker that detects a downgrade via FeatureLevelRecord could generate its own downgrade snapshot and reload its state from that. This does get a little fuzzy when we consider cases where brokers are on different software versions and could be generating a downgrade snapshot for version X, but using different versions of the code. It might be safer to let the controller generate the snapshot so each broker (regardless of software version) gets the same records. However, for upgrades (or downgrades) we expect the whole cluster to be running the same software version before triggering the metadata.version change, so perhaps this isn't a likely scenario. Thoughts?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 3. Max metadata version
> >> >> > >> >> > > >>> > > For the first release that supports metadata.version, we can simply initialize metadata.version with the current (and only) version. For future releases, we will need a mechanism to bootstrap a particular version. This could be done using the meta.properties file or some similar mechanism. The reason we need to allow for a specific initial version is to support the use case of starting a Kafka cluster at version X with an older metadata.version.
> >> >> > >> >> > > >>> > I assume that the Active Controller will learn the metadata version of the broker through the BrokerRegistrationRequest.
> >> >> > >> >> > > >>> > How will the Active Controller learn about the max metadata version of the inactive controller nodes? We currently don't send a registration request from the inactive controller to the active controller.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> This came up during the design, but I neglected to add it to the KIP. We will need a mechanism for determining the supported features of each controller, similar to how brokers use BrokerRegistrationRequest. Perhaps controllers could write a FeatureLevelRecord (or similar) to the metadata log indicating their supported version. WDYT?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> > Why do you need to bootstrap a particular version? Isn't the intent that the broker will learn the active metadata version by reading the metadata before unfencing?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> This bootstrapping is needed for when a KRaft cluster is first started. If we don't have this mechanism, the cluster can't really do anything until the operator finalizes the metadata.version with the tool. The bootstrapping will be done by the controller and the brokers will see this version as a record (like you say). I'll add some text to clarify this.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 4. Reject Registration - This is related to the bullet point above.
> >> >> > >> >> > > >>> > What will be the behavior of the active controller if the broker sends a metadata version that is not compatible with the cluster-wide metadata version?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> If a broker starts up with a lower supported version range than the current cluster metadata.version, it should log an error and shut down. This is in line with KIP-584.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 5. Discover upgrade
> >> >> > >> >> > > >>> > > This snapshot will be a convenient way to let broker and controller components rebuild their entire in-memory state following an upgrade.
> >> >> > >> >> > > >>> > Can we rely on the presence of the FeatureLevelRecord for the metadata version for this functionality?
> >> >> > >> >> > > >>> > If so, it avoids having to reload the snapshot.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> For upgrades, yes, probably, since we won't need to "rewrite" any records in this case. For downgrades, we will need to generate the snapshot and reload everything.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 6. Metadata version specification
> >> >> > >> >> > > >>> > > V4(version=4, isBackwardsCompatible=false, description="New metadata record type Bar"),
> >> >> > >> >> > > >>> > Very cool. Do you have plans to generate Apache Kafka HTML documentation for this information? Would it be helpful to display this information to the user using kafka-features.sh and the feature RPC?
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> Hm, good idea :) I'll add a brief section on documentation. This would certainly be very useful.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> 7. Downgrade records
> >> >> > >> >> > > >>> > I think we should explicitly mention that the downgrade process will downgrade both metadata records and controller records. In KIP-630 we introduced two control records for snapshots.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> Yes, good call. Let me re-read that KIP and include some details.
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> Thanks again for the comments!
> >> >> > >> >> > > >>> -David
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> On Wed, Sep 29, 2021 at 5:09 PM José Armando García Sancio <jsan...@confluent.io.invalid> wrote:
> >> >> > >> >> > > >>>
> >> >> > >> >> > > >>> > One more comment.
> >> >> > >> >> > > >>> >
> >> >> > >> >> > > >>> > 7. Downgrade records
> >> >> > >> >> > > >>> > I think we should explicitly mention that the downgrade process will downgrade both metadata records and controller records. In KIP-630 we introduced two control records for snapshots.
> >> >> > >> >>
> >> >> > >> >> --
> >> >> > >> >> -David

--
-David
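P.S. Since point 6 above quotes a line from the metadata version specification, here is roughly what the full enum could look like. Only the V4 line comes from the KIP text; the other entries and the accessors are invented for illustration:

    public enum MetadataVersionSpec {
        V1(1, true, "Initial KRaft metadata.version"),
        V2(2, true, "Adds metadata record type Foo"),
        V3(3, true, "Adds a new field to record type Foo"),
        V4(4, false, "New metadata record type Bar");

        private final int version;
        private final boolean isBackwardsCompatible;
        private final String description;

        MetadataVersionSpec(int version, boolean isBackwardsCompatible, String description) {
            this.version = version;
            this.isBackwardsCompatible = isBackwardsCompatible;
            this.description = description;
        }

        public int version() { return version; }
        public boolean isBackwardsCompatible() { return isBackwardsCompatible; }
        public String description() { return description; }
    }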