Re: [DISCUSS] KIP-778 KRaft Upgrades
Tom, thanks for taking a look!

1. Yes, you're right; I'll fix that.

2. "Components" refers to broker objects like ReplicaManager, ClientQuotaManager, etc. Controller components (so far) don't need to reload any state based on metadata.version changes.

3. Yes: if a controller reads a FeatureLevelRecord from a snapshot or commit event, and it's for an unsupportable metadata.version, it will terminate.

4. I think it should be impossible for the active controller to read an unsupported metadata.version except for bugs.

5. Yes, that's possible; however, once B sees the new version, it will close its connections to A and force the ApiVersions exchange to happen again. This is an area we're looking to improve on in a future KIP.

6. Yes, it should; I'll add it.

Cheers,
David

On Fri, Dec 17, 2021 at 12:16 PM Tom Bentley wrote:
> Hi David,
>
> Thanks for the KIP. Sorry I'm so late in looking at this. I have a few
> questions.
>
> 1. In the Proposed Changes Overview it says "The controller validates that
> the cluster can be safely downgraded to this version (override with
> --force)". I think that should be "--unsafe"?
>
> 2. Also in this section it says "Components reload their state with new
> version"; the word "components" is a little vague. I think it means
> controllers and brokers, right?
>
> 3. In the compatibility section it says "A controller running old software
> will join the quorum and begin replicating the metadata log. If this
> inactive controller encounters a FeatureLevelRecord for *metadata.version*
> that it cannot support, it should terminate." I assume "encounters"
> includes the snapshot case?
>
> 4. "In the unlikely event that an active controller encounters an
> unsupported *metadata.version*, it should resign and terminate." I'm
> unclear on how this could happen; could you elaborate?
>
> 5. In the ApiVersions part of the Upgrades section, "Brokers will observe
> changes to *metadata.version* as they replicate records from the metadata
> log.
> If a new *metadata.version* is seen, brokers will renegotiate
> compatible RPCs with other brokers through the ApiVersions workflow."
> Is it possible that broker A does a metadata fetch, sees a changed
> metadata.version and attempts renegotiation with broker B before B has seen
> the metadata.version?
>
> 6. Why does "kafka-features.sh downgrade" not support "--release"?
>
> Kind regards,
>
> Tom
>
> On Tue, Nov 30, 2021 at 10:26 PM Jun Rao wrote:
> > Hi, David,
> >
> > Thanks for the reply. No more questions from me.
> >
> > Jun
> >
> > On Tue, Nov 30, 2021 at 1:52 PM David Arthur wrote:
> > > Thanks Jun,
> > >
> > > 30: I clarified the description of the "upgrade" command to:
> > >
> > > > The FEATURE and VERSION arguments can be repeated to indicate an
> > > > upgrade of multiple features in one request. Alternatively, the
> > > > RELEASE flag can be given to upgrade to the default versions of the
> > > > specified release. These two options, FEATURE and RELEASE, are
> > > > mutually exclusive. If neither is given, this command will do nothing.
> > >
> > > Basically, you must provide either "kafka-features.sh upgrade --release
> > > 3.2" or something like "kafka-features.sh upgrade --feature X --version 10"
> > >
> > > -David
> > >
> > > On Tue, Nov 30, 2021 at 2:51 PM Jun Rao wrote:
> > > > Hi, David,
> > > >
> > > > Thanks for the reply. Just one more minor comment.
> > > >
> > > > 30. ./kafka-features.sh upgrade: It seems that the release param is
> > > > optional. Could you describe the semantic when release is not specified?
> > > >
> > > > Jun
> > > >
> > > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur wrote:
> > > > > Jun, I updated the KIP with the "disable" CLI.
> > > > >
> > > > > For 16, I think you're asking how we can safely introduce the
> > > > > initial version of other feature flags in the future.
> > > > > This will probably depend on the nature of the feature that the
> > > > > new flag is gating. We can probably take a similar approach and say
> > > > > version 1 is backwards compatible to some point (possibly an IBP or
> > > > > metadata.version?).
> > > > >
> > > > > -David
> > > > >
> > > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao wrote:
> > > > > > Hi, David,
> > > > > >
> > > > > > Thanks for the reply. The new CLI sounds reasonable to me.
> > > > > >
> > > > > > 16.
> > > > > > For case C, choosing the latest version sounds good to me.
> > > > > > For case B, for metadata.version, we pick version 1 since it just
> > > > > > happens that for metadata.version version 1 is backward compatible.
> > > > > > How do we make this more general for other features?
> > > > > >
> > > > > > 21. Do you still plan to change "delete" to "disable" in the CLI?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur wrote:
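David's answer 5 above says that once broker B sees the new metadata.version, it closes its connections to A, forcing the ApiVersions exchange to happen again on reconnect. Here is a minimal sketch of that reconnect-and-renegotiate rule; all class and method names are invented for illustration and are not Kafka's actual network internals.

```python
# Hedged model (illustrative names, not Kafka code) of the rule in answer 5:
# observing a new metadata.version closes existing inter-broker connections,
# so the next request re-runs the ApiVersions exchange with each peer.

class BrokerChannel:
    def __init__(self, peer):
        self.peer = peer
        self.negotiated = None      # RPC versions agreed via ApiVersions

    def handshake(self, metadata_version):
        self.negotiated = metadata_version


class Broker:
    def __init__(self):
        self.metadata_version = 1
        self.channels = {}

    def send(self, peer):
        ch = self.channels.get(peer)
        if ch is None:              # (re)connect: ApiVersions exchange first
            ch = BrokerChannel(peer)
            ch.handshake(self.metadata_version)
            self.channels[peer] = ch
        return ch.negotiated

    def on_metadata_version_change(self, new_version):
        self.metadata_version = new_version
        self.channels.clear()       # close connections; force re-exchange


b = Broker()
assert b.send("A") == 1             # first exchange negotiates version 1
b.on_metadata_version_change(2)
assert b.send("A") == 2             # reconnect renegotiates at version 2
```

This is why the race Tom asks about is transient: even if A renegotiates before B has replicated the new version, B's own reconnect (once it sees the record) triggers a fresh exchange.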
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David,

Thanks for the KIP. Sorry I'm so late in looking at this. I have a few questions.

1. In the Proposed Changes Overview it says "The controller validates that the cluster can be safely downgraded to this version (override with --force)". I think that should be "--unsafe"?

2. Also in this section it says "Components reload their state with new version"; the word "components" is a little vague. I think it means controllers and brokers, right?

3. In the compatibility section it says "A controller running old software will join the quorum and begin replicating the metadata log. If this inactive controller encounters a FeatureLevelRecord for *metadata.version* that it cannot support, it should terminate." I assume "encounters" includes the snapshot case?

4. "In the unlikely event that an active controller encounters an unsupported *metadata.version*, it should resign and terminate." I'm unclear on how this could happen; could you elaborate?

5. In the ApiVersions part of the Upgrades section, "Brokers will observe changes to *metadata.version* as they replicate records from the metadata log. If a new *metadata.version* is seen, brokers will renegotiate compatible RPCs with other brokers through the ApiVersions workflow." Is it possible that broker A does a metadata fetch, sees a changed metadata.version and attempts renegotiation with broker B before B has seen the metadata.version?

6. Why does "kafka-features.sh downgrade" not support "--release"?

Kind regards,

Tom

On Tue, Nov 30, 2021 at 10:26 PM Jun Rao wrote:
> Hi, David,
>
> Thanks for the reply. No more questions from me.
>
> Jun
>
> On Tue, Nov 30, 2021 at 1:52 PM David Arthur wrote:
> > Thanks Jun,
> >
> > 30: I clarified the description of the "upgrade" command to:
> >
> > > The FEATURE and VERSION arguments can be repeated to indicate an
> > > upgrade of multiple features in one request. Alternatively, the
> > > RELEASE flag can be given to upgrade to the default versions of the
> > > specified release.
> These > > > two options, FEATURE and RELEASE, are mutually exclusive. If neither is > > > given, this command will do nothing. > > > > > > Basically, you must provide either "kafka-features.sh upgrade --release > > 3.2" or something like "kafka-features.sh upgrade --feature X --version > 10" > > > > -David > > > > On Tue, Nov 30, 2021 at 2:51 PM Jun Rao > wrote: > > > > > Hi, David, > > > > > > Thanks for the reply. Just one more minor comment. > > > > > > 30. ./kafka-features.sh upgrade: It seems that the release param is > > > optional. Could you describe the semantic when release is not > specified? > > > > > > Jun > > > > > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur > > > wrote: > > > > > > > Jun, I updated the KIP with the "disable" CLI. > > > > > > > > For 16, I think you're asking how we can safely introduce the > > > > initial version of other feature flags in the future. This will > > probably > > > > depend on the nature of the feature that the new flag is gating. We > can > > > > probably take a similar approach and say version 1 is backwards > > > compatible > > > > to some point (possibly an IBP or metadata.version?). > > > > > > > > -David > > > > > > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao > > > wrote: > > > > > > > > > Hi, David, > > > > > > > > > > Thanks for the reply. The new CLI sounds reasonable to me. > > > > > > > > > > 16. > > > > > For case C, choosing the latest version sounds good to me. > > > > > For case B, for metadata.version, we pick version 1 since it just > > > happens > > > > > that for metadata.version version 1 is backward compatible. How do > we > > > > make > > > > > this more general for other features? > > > > > > > > > > 21. Do you still plan to change "delete" to "disable" in the CLI? 
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur wrote:
> > > > > > Jun,
> > > > > >
> > > > > > The KIP has some changes to the CLI for KIP-584. With Jason's
> > > > > > suggestion incorporated, these three commands would look like:
> > > > > >
> > > > > > 1) kafka-features.sh upgrade --release latest
> > > > > > Upgrades all known features to their defaults in the latest release
> > > > > >
> > > > > > 2) kafka-features.sh downgrade --release 3.x
> > > > > > Downgrades all known features to the default versions from 3.x
> > > > > >
> > > > > > 3) kafka-features.sh describe --release latest
> > > > > > Describes the known features of the latest release
> > > > > >
> > > > > > The --upgrade-all and --downgrade-all are replaced by specifying
> > > > > > each feature+version or giving the --release argument. One problem
> > > > > > with --downgrade-all is it's not clear what the target versions
> > > > > > should be: the previous version before the last upgrade -- or the
> > > > > > lowest supported version. Since downgrades will be less common, I
> > > > > > think it's better
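Tom's questions 3 and 4 above, with David's answers, pin down one rule: whether a FeatureLevelRecord arrives via a snapshot or a log commit, a controller that cannot support the finalized metadata.version must terminate rather than replay it. A minimal sketch of that rule, using invented record shapes and names (not Kafka's actual classes):

```python
# Hedged sketch of the termination rule from the thread: the same check
# applies to snapshot loading and log replay. SUPPORTED, the record dict
# shape, and FatalMetadataVersion are all illustrative assumptions.

SUPPORTED = range(1, 3)   # this binary supports metadata.version 1..2


class FatalMetadataVersion(Exception):
    """Raised in place of terminating the process in this model."""


def handle_feature_level(record, supported=SUPPORTED):
    """Apply a FeatureLevelRecord seen during snapshot load or log replay."""
    if record["name"] != "metadata.version":
        return None                       # other feature flags: out of scope here
    level = record["featureLevel"]
    if level not in supported:
        # An inactive controller terminates; an active one resigns first,
        # though the thread notes that case should only arise from bugs.
        raise FatalMetadataVersion(f"unsupported metadata.version {level}")
    return level


assert handle_feature_level({"name": "metadata.version", "featureLevel": 2}) == 2
```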
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Thanks for the reply. No more questions from me. Jun On Tue, Nov 30, 2021 at 1:52 PM David Arthur wrote: > Thanks Jun, > > 30: I clarified the description of the "upgrade" command to: > > The FEATURE and VERSION arguments can be repeated to indicate an upgrade of > > multiple features in one request. Alternatively, the RELEASE flag can be > > given to upgrade to the default versions of the specified release. These > > two options, FEATURE and RELEASE, are mutually exclusive. If neither is > > given, this command will do nothing. > > > Basically, you must provide either "kafka-features.sh upgrade --release > 3.2" or something like "kafka-features.sh upgrade --feature X --version 10" > > -David > > On Tue, Nov 30, 2021 at 2:51 PM Jun Rao wrote: > > > Hi, David, > > > > Thanks for the reply. Just one more minor comment. > > > > 30. ./kafka-features.sh upgrade: It seems that the release param is > > optional. Could you describe the semantic when release is not specified? > > > > Jun > > > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur > > wrote: > > > > > Jun, I updated the KIP with the "disable" CLI. > > > > > > For 16, I think you're asking how we can safely introduce the > > > initial version of other feature flags in the future. This will > probably > > > depend on the nature of the feature that the new flag is gating. We can > > > probably take a similar approach and say version 1 is backwards > > compatible > > > to some point (possibly an IBP or metadata.version?). > > > > > > -David > > > > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao > > wrote: > > > > > > > Hi, David, > > > > > > > > Thanks for the reply. The new CLI sounds reasonable to me. > > > > > > > > 16. > > > > For case C, choosing the latest version sounds good to me. > > > > For case B, for metadata.version, we pick version 1 since it just > > happens > > > > that for metadata.version version 1 is backward compatible. How do we > > > make > > > > this more general for other features? 
> > > > > > > > 21. Do you still plan to change "delete" to "disable" in the CLI? > > > > > > > > Thanks, > > > > > > > > Jun > > > > > > > > > > > > > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur > > > > wrote: > > > > > > > > > Jun, > > > > > > > > > > The KIP has some changes to the CLI for KIP-584. With Jason's > > > suggestion > > > > > incorporated, these three commands would look like: > > > > > > > > > > 1) kafka-features.sh upgrade --release latest > > > > > upgrades all known features to their defaults in the latest release > > > > > > > > > > 2) kafka-features.sh downgrade --release 3.x > > > > > downgrade all known features to the default versions from 3.x > > > > > > > > > > 3) kafka-features.sh describe --release latest > > > > > Describe the known features of the latest release > > > > > > > > > > The --upgrade-all and --downgrade-all are replaced by specific each > > > > > feature+version or giving the --release argument. One problem with > > > > > --downgrade-all is it's not clear what the target versions should > be: > > > the > > > > > previous version before the last upgrade -- or the lowest supported > > > > > version. Since downgrades will be less common, I think it's better > to > > > > make > > > > > the operator be more explicit about the desired downgrade version > > > (either > > > > > through [--feature X --version Y] or [--release 3.1]). Does that > seem > > > > > reasonable? > > > > > > > > > > 16. For case C, I think we will want to always use the latest > > > > > metadata.version for new clusters (like we do for IBP). We can > always > > > > > change what we mean by "default" down the road. Also, this will > > likely > > > > > become a bit of information we include in release and upgrade notes > > > with > > > > > each release. 
> > > > > > > > > > -David > > > > > > > > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao > > > > wrote: > > > > > > > > > > > Hi, Jason, David, > > > > > > > > > > > > Just to clarify on the interaction with the end user, the design > in > > > > > KIP-584 > > > > > > allows one to do the following. > > > > > > (1) kafka-features.sh --upgrade-all --bootstrap-server > > > > > > kafka-broker0.prn1:9071 to upgrade all features to the latest > > version > > > > > known > > > > > > by the tool. The user doesn't need to know a specific feature > > > version. > > > > > > (2) kafka-features.sh --downgrade-all --bootstrap-server > > > > > > kafka-broker0.prn1:9071 to downgrade all features to the latest > > > version > > > > > > known by the tool. The user doesn't need to know a specific > feature > > > > > > version. > > > > > > (3) kafka-features.sh --describe --bootstrap-server > > > > > > kafka-broker0.prn1:9071 to find out the supported version for > each > > > > > feature. > > > > > > This allows the user to upgrade each feature individually. > > > > > > > > > > > > Most users will be doing (1) and occasionally (2), and won't need > > to > > > > know > > > > > > the exact version of
Re: [DISCUSS] KIP-778 KRaft Upgrades
Thanks Jun, 30: I clarified the description of the "upgrade" command to: The FEATURE and VERSION arguments can be repeated to indicate an upgrade of > multiple features in one request. Alternatively, the RELEASE flag can be > given to upgrade to the default versions of the specified release. These > two options, FEATURE and RELEASE, are mutually exclusive. If neither is > given, this command will do nothing. Basically, you must provide either "kafka-features.sh upgrade --release 3.2" or something like "kafka-features.sh upgrade --feature X --version 10" -David On Tue, Nov 30, 2021 at 2:51 PM Jun Rao wrote: > Hi, David, > > Thanks for the reply. Just one more minor comment. > > 30. ./kafka-features.sh upgrade: It seems that the release param is > optional. Could you describe the semantic when release is not specified? > > Jun > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur > wrote: > > > Jun, I updated the KIP with the "disable" CLI. > > > > For 16, I think you're asking how we can safely introduce the > > initial version of other feature flags in the future. This will probably > > depend on the nature of the feature that the new flag is gating. We can > > probably take a similar approach and say version 1 is backwards > compatible > > to some point (possibly an IBP or metadata.version?). > > > > -David > > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao > wrote: > > > > > Hi, David, > > > > > > Thanks for the reply. The new CLI sounds reasonable to me. > > > > > > 16. > > > For case C, choosing the latest version sounds good to me. > > > For case B, for metadata.version, we pick version 1 since it just > happens > > > that for metadata.version version 1 is backward compatible. How do we > > make > > > this more general for other features? > > > > > > 21. Do you still plan to change "delete" to "disable" in the CLI? 
> > > > > > Thanks, > > > > > > Jun > > > > > > > > > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur > > > wrote: > > > > > > > Jun, > > > > > > > > The KIP has some changes to the CLI for KIP-584. With Jason's > > suggestion > > > > incorporated, these three commands would look like: > > > > > > > > 1) kafka-features.sh upgrade --release latest > > > > upgrades all known features to their defaults in the latest release > > > > > > > > 2) kafka-features.sh downgrade --release 3.x > > > > downgrade all known features to the default versions from 3.x > > > > > > > > 3) kafka-features.sh describe --release latest > > > > Describe the known features of the latest release > > > > > > > > The --upgrade-all and --downgrade-all are replaced by specific each > > > > feature+version or giving the --release argument. One problem with > > > > --downgrade-all is it's not clear what the target versions should be: > > the > > > > previous version before the last upgrade -- or the lowest supported > > > > version. Since downgrades will be less common, I think it's better to > > > make > > > > the operator be more explicit about the desired downgrade version > > (either > > > > through [--feature X --version Y] or [--release 3.1]). Does that seem > > > > reasonable? > > > > > > > > 16. For case C, I think we will want to always use the latest > > > > metadata.version for new clusters (like we do for IBP). We can always > > > > change what we mean by "default" down the road. Also, this will > likely > > > > become a bit of information we include in release and upgrade notes > > with > > > > each release. > > > > > > > > -David > > > > > > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao > > > wrote: > > > > > > > > > Hi, Jason, David, > > > > > > > > > > Just to clarify on the interaction with the end user, the design in > > > > KIP-584 > > > > > allows one to do the following. 
> > > > > (1) kafka-features.sh --upgrade-all --bootstrap-server > > > > > kafka-broker0.prn1:9071 to upgrade all features to the latest > version > > > > known > > > > > by the tool. The user doesn't need to know a specific feature > > version. > > > > > (2) kafka-features.sh --downgrade-all --bootstrap-server > > > > > kafka-broker0.prn1:9071 to downgrade all features to the latest > > version > > > > > known by the tool. The user doesn't need to know a specific feature > > > > > version. > > > > > (3) kafka-features.sh --describe --bootstrap-server > > > > > kafka-broker0.prn1:9071 to find out the supported version for each > > > > feature. > > > > > This allows the user to upgrade each feature individually. > > > > > > > > > > Most users will be doing (1) and occasionally (2), and won't need > to > > > know > > > > > the exact version of each feature. > > > > > > > > > > 16. For case C, what's the default version? Is it always the > latest? > > > > > > > > > > Thanks, > > > > > > > > > > Jun > > > > > > > > > > > > > > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur > > > > > wrote: > > > > > > > > > > > Colin, thanks for the detailed response. I understand what you're > > > > saying > > > > > > and I agree with your ration
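The "upgrade" command description David quotes above (repeatable FEATURE/VERSION pairs or a single RELEASE, mutually exclusive, no-op when neither is given) can be modeled in a few lines. This is an illustrative sketch of the argument semantics only, not the real kafka-features.sh implementation:

```python
# Hedged model of the documented argument rules for "kafka-features.sh
# upgrade". The function name and return shape are assumptions.

def plan_upgrade(features=(), release=None):
    """features: iterable of (name, version) pairs; release: e.g. "3.2"."""
    features = list(features)
    if features and release:
        # FEATURE and RELEASE are mutually exclusive
        raise ValueError("--feature/--version and --release are mutually exclusive")
    if release:
        # upgrade all known features to that release's defaults
        return {"release": release}
    if features:
        # upgrade only the named features to the given versions
        return {"features": dict(features)}
    return None                           # neither given: the command does nothing


assert plan_upgrade(release="3.2") == {"release": "3.2"}
assert plan_upgrade(features=[("metadata.version", 10)]) == \
    {"features": {"metadata.version": 10}}
assert plan_upgrade() is None
```

The same shape, minus the release branch, matches David's later point that "downgrade" deliberately forces the operator to be explicit about target versions.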
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Thanks for the reply. Just one more minor comment. 30. ./kafka-features.sh upgrade: It seems that the release param is optional. Could you describe the semantic when release is not specified? Jun On Mon, Nov 29, 2021 at 5:06 PM David Arthur wrote: > Jun, I updated the KIP with the "disable" CLI. > > For 16, I think you're asking how we can safely introduce the > initial version of other feature flags in the future. This will probably > depend on the nature of the feature that the new flag is gating. We can > probably take a similar approach and say version 1 is backwards compatible > to some point (possibly an IBP or metadata.version?). > > -David > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao wrote: > > > Hi, David, > > > > Thanks for the reply. The new CLI sounds reasonable to me. > > > > 16. > > For case C, choosing the latest version sounds good to me. > > For case B, for metadata.version, we pick version 1 since it just happens > > that for metadata.version version 1 is backward compatible. How do we > make > > this more general for other features? > > > > 21. Do you still plan to change "delete" to "disable" in the CLI? > > > > Thanks, > > > > Jun > > > > > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur > > wrote: > > > > > Jun, > > > > > > The KIP has some changes to the CLI for KIP-584. With Jason's > suggestion > > > incorporated, these three commands would look like: > > > > > > 1) kafka-features.sh upgrade --release latest > > > upgrades all known features to their defaults in the latest release > > > > > > 2) kafka-features.sh downgrade --release 3.x > > > downgrade all known features to the default versions from 3.x > > > > > > 3) kafka-features.sh describe --release latest > > > Describe the known features of the latest release > > > > > > The --upgrade-all and --downgrade-all are replaced by specific each > > > feature+version or giving the --release argument. 
One problem with > > > --downgrade-all is it's not clear what the target versions should be: > the > > > previous version before the last upgrade -- or the lowest supported > > > version. Since downgrades will be less common, I think it's better to > > make > > > the operator be more explicit about the desired downgrade version > (either > > > through [--feature X --version Y] or [--release 3.1]). Does that seem > > > reasonable? > > > > > > 16. For case C, I think we will want to always use the latest > > > metadata.version for new clusters (like we do for IBP). We can always > > > change what we mean by "default" down the road. Also, this will likely > > > become a bit of information we include in release and upgrade notes > with > > > each release. > > > > > > -David > > > > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao > > wrote: > > > > > > > Hi, Jason, David, > > > > > > > > Just to clarify on the interaction with the end user, the design in > > > KIP-584 > > > > allows one to do the following. > > > > (1) kafka-features.sh --upgrade-all --bootstrap-server > > > > kafka-broker0.prn1:9071 to upgrade all features to the latest version > > > known > > > > by the tool. The user doesn't need to know a specific feature > version. > > > > (2) kafka-features.sh --downgrade-all --bootstrap-server > > > > kafka-broker0.prn1:9071 to downgrade all features to the latest > version > > > > known by the tool. The user doesn't need to know a specific feature > > > > version. > > > > (3) kafka-features.sh --describe --bootstrap-server > > > > kafka-broker0.prn1:9071 to find out the supported version for each > > > feature. > > > > This allows the user to upgrade each feature individually. > > > > > > > > Most users will be doing (1) and occasionally (2), and won't need to > > know > > > > the exact version of each feature. > > > > > > > > 16. For case C, what's the default version? Is it always the latest? 
> > > > > > > > Thanks, > > > > > > > > Jun > > > > > > > > > > > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur > > > > wrote: > > > > > > > > > Colin, thanks for the detailed response. I understand what you're > > > saying > > > > > and I agree with your rationale. > > > > > > > > > > It seems like we could just initialize their cluster.metadata to 1 > > when > > > > the > > > > > > software is upgraded to 3.2. > > > > > > > > > > > > > > > > Concretely, this means the controller would see that there is no > > > > > FeatureLevelRecord in the log, and so it would bootstrap one. > > Normally > > > > for > > > > > cluster initialization, this would be read from meta.properties, > but > > in > > > > the > > > > > case of preview clusters we would need to special case it to > > initialize > > > > the > > > > > version to 1. > > > > > > > > > > Once the new FeatureLevelRecord has been committed, any nodes > > (brokers > > > or > > > > > controllers) that are running the latest software will react to the > > new > > > > > metadata.version. This means we will need to make this initial > > version > > > > of 1 > > > > > be back
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jun, I updated the KIP with the "disable" CLI. For 16, I think you're asking how we can safely introduce the initial version of other feature flags in the future. This will probably depend on the nature of the feature that the new flag is gating. We can probably take a similar approach and say version 1 is backwards compatible to some point (possibly an IBP or metadata.version?). -David On Fri, Nov 19, 2021 at 3:00 PM Jun Rao wrote: > Hi, David, > > Thanks for the reply. The new CLI sounds reasonable to me. > > 16. > For case C, choosing the latest version sounds good to me. > For case B, for metadata.version, we pick version 1 since it just happens > that for metadata.version version 1 is backward compatible. How do we make > this more general for other features? > > 21. Do you still plan to change "delete" to "disable" in the CLI? > > Thanks, > > Jun > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur > wrote: > > > Jun, > > > > The KIP has some changes to the CLI for KIP-584. With Jason's suggestion > > incorporated, these three commands would look like: > > > > 1) kafka-features.sh upgrade --release latest > > upgrades all known features to their defaults in the latest release > > > > 2) kafka-features.sh downgrade --release 3.x > > downgrade all known features to the default versions from 3.x > > > > 3) kafka-features.sh describe --release latest > > Describe the known features of the latest release > > > > The --upgrade-all and --downgrade-all are replaced by specific each > > feature+version or giving the --release argument. One problem with > > --downgrade-all is it's not clear what the target versions should be: the > > previous version before the last upgrade -- or the lowest supported > > version. Since downgrades will be less common, I think it's better to > make > > the operator be more explicit about the desired downgrade version (either > > through [--feature X --version Y] or [--release 3.1]). Does that seem > > reasonable? > > > > 16. 
For case C, I think we will want to always use the latest > > metadata.version for new clusters (like we do for IBP). We can always > > change what we mean by "default" down the road. Also, this will likely > > become a bit of information we include in release and upgrade notes with > > each release. > > > > -David > > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao > wrote: > > > > > Hi, Jason, David, > > > > > > Just to clarify on the interaction with the end user, the design in > > KIP-584 > > > allows one to do the following. > > > (1) kafka-features.sh --upgrade-all --bootstrap-server > > > kafka-broker0.prn1:9071 to upgrade all features to the latest version > > known > > > by the tool. The user doesn't need to know a specific feature version. > > > (2) kafka-features.sh --downgrade-all --bootstrap-server > > > kafka-broker0.prn1:9071 to downgrade all features to the latest version > > > known by the tool. The user doesn't need to know a specific feature > > > version. > > > (3) kafka-features.sh --describe --bootstrap-server > > > kafka-broker0.prn1:9071 to find out the supported version for each > > feature. > > > This allows the user to upgrade each feature individually. > > > > > > Most users will be doing (1) and occasionally (2), and won't need to > know > > > the exact version of each feature. > > > > > > 16. For case C, what's the default version? Is it always the latest? > > > > > > Thanks, > > > > > > Jun > > > > > > > > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur > > > wrote: > > > > > > > Colin, thanks for the detailed response. I understand what you're > > saying > > > > and I agree with your rationale. > > > > > > > > It seems like we could just initialize their cluster.metadata to 1 > when > > > the > > > > > software is upgraded to 3.2. > > > > > > > > > > > > > Concretely, this means the controller would see that there is no > > > > FeatureLevelRecord in the log, and so it would bootstrap one. 
> Normally > > > for > > > > cluster initialization, this would be read from meta.properties, but > in > > > the > > > > case of preview clusters we would need to special case it to > initialize > > > the > > > > version to 1. > > > > > > > > Once the new FeatureLevelRecord has been committed, any nodes > (brokers > > or > > > > controllers) that are running the latest software will react to the > new > > > > metadata.version. This means we will need to make this initial > version > > > of 1 > > > > be backwards compatible to what we have in 3.0 and 3.1 (since some > > > brokers > > > > will be on the new software and some on older/preview versions) > > > > > > > > I agree that auto-upgrading preview users from IBP 3.0 to > > > metadata.version > > > > 1 (equivalent to IBP 3.2) is probably fine. > > > > > > > > Back to Jun's case B: > > > > > > > > b. We upgrade from an old version where no metadata.version has been > > > > > finalized. In this case, it makes sense to leave metadata.version > > > > disabled > > > > > since we don't kno
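The bootstrap behavior discussed above can be summarized as: if the log already holds a FeatureLevelRecord, use it; otherwise a fresh cluster takes its version from meta.properties, while an upgraded preview cluster is special-cased to finalize metadata.version 1 (which must stay backward compatible with 3.0/3.1). A minimal sketch under those assumptions, with an invented record shape:

```python
# Hedged sketch of the bootstrap decision from the thread; the record dict
# shape and parameter names are illustrative, not Kafka's actual code.

def bootstrap_metadata_version(log_records, fresh_cluster_version=None):
    for rec in log_records:
        if rec.get("name") == "metadata.version":
            return rec["featureLevel"]    # already finalized; nothing to bootstrap
    if fresh_cluster_version is not None:
        return fresh_cluster_version      # new cluster: version from meta.properties
    return 1                              # upgraded preview cluster: special-case to 1


assert bootstrap_metadata_version(
    [{"name": "metadata.version", "featureLevel": 3}]) == 3
assert bootstrap_metadata_version([]) == 1
```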
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David,

Thanks for the reply. The new CLI sounds reasonable to me.

16. For case C, choosing the latest version sounds good to me. For case B, for metadata.version we pick version 1, since it just happens that version 1 is backward compatible. How do we make this more general for other features?

21. Do you still plan to change "delete" to "disable" in the CLI?

Thanks,

Jun

On Thu, Nov 18, 2021 at 11:50 AM David Arthur wrote:
> [...]
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jun,

The KIP has some changes to the CLI for KIP-584. With Jason's suggestion incorporated, these three commands would look like:

1) kafka-features.sh upgrade --release latest
   Upgrades all known features to their defaults in the latest release

2) kafka-features.sh downgrade --release 3.x
   Downgrades all known features to the default versions from 3.x

3) kafka-features.sh describe --release latest
   Describes the known features of the latest release

The --upgrade-all and --downgrade-all flags are replaced by specifying each feature+version, or by giving the --release argument. One problem with --downgrade-all is that it's not clear what the target versions should be: the previous version before the last upgrade, or the lowest supported version. Since downgrades will be less common, I think it's better to make the operator be more explicit about the desired downgrade version (either through [--feature X --version Y] or [--release 3.1]). Does that seem reasonable?

16. For case C, I think we will want to always use the latest metadata.version for new clusters (like we do for IBP). We can always change what we mean by "default" down the road. Also, this will likely become a bit of information we include in release and upgrade notes with each release.

-David

On Thu, Nov 18, 2021 at 1:14 PM Jun Rao wrote:
> [...]
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, Jason, David,

Just to clarify on the interaction with the end user, the design in KIP-584 allows one to do the following:

(1) kafka-features.sh --upgrade-all --bootstrap-server kafka-broker0.prn1:9071 to upgrade all features to the latest version known by the tool. The user doesn't need to know a specific feature version.

(2) kafka-features.sh --downgrade-all --bootstrap-server kafka-broker0.prn1:9071 to downgrade all features to the latest version known by the tool. The user doesn't need to know a specific feature version.

(3) kafka-features.sh --describe --bootstrap-server kafka-broker0.prn1:9071 to find out the supported version for each feature. This allows the user to upgrade each feature individually.

Most users will be doing (1) and occasionally (2), and won't need to know the exact version of each feature.

16. For case C, what's the default version? Is it always the latest?

Thanks,

Jun

On Thu, Nov 18, 2021 at 8:15 AM David Arthur wrote:
> [...]
Re: [DISCUSS] KIP-778 KRaft Upgrades
Colin, thanks for the detailed response. I understand what you're saying and I agree with your rationale.

> It seems like we could just initialize their cluster.metadata to 1 when the software is upgraded to 3.2.

Concretely, this means the controller would see that there is no FeatureLevelRecord in the log, and so it would bootstrap one. Normally for cluster initialization, this would be read from meta.properties, but in the case of preview clusters we would need to special-case it to initialize the version to 1.

Once the new FeatureLevelRecord has been committed, any nodes (brokers or controllers) that are running the latest software will react to the new metadata.version. This means we will need to make this initial version 1 be backwards compatible with what we have in 3.0 and 3.1 (since some brokers will be on the new software and some on older/preview versions).

I agree that auto-upgrading preview users from IBP 3.0 to metadata.version 1 (equivalent to IBP 3.2) is probably fine.

Back to Jun's case B:

> b. We upgrade from an old version where no metadata.version has been finalized. In this case, it makes sense to leave metadata.version disabled since we don't know if all brokers have been upgraded.

Instead of leaving it uninitialized, we would initialize it to 1, which would be backwards compatible with 3.0 and 3.1. This would eliminate the need to check a "final IBP" or deal with 3.2+ clusters without a metadata.version set. The downgrade path from 3.2 back to a preview release should be fine since we are saying that metadata.version 1 is compatible with those releases. The FeatureLevelRecord would exist, but it would be ignored (we might need to make sure this actually works in 3.0).

For clarity, here are the workflows for Jun's three cases:

A (KRaft 3.2+ to KRaft 3.2+):
* User initializes cluster with an explicit metadata.version X, or the tool selects the default
* After upgrade, cluster stays at version X until operator upgrades to Y

B (KRaft Preview to KRaft 3.2+):
* User initializes cluster without metadata.version
* After the controller leader is upgraded, it initializes metadata.version to 1
* After the cluster is upgraded, an operator may further upgrade to version Y

C (New KRaft 3.2+):
* User initializes cluster with an explicit metadata.version X, or the tool selects the default

This has been mentioned in this thread, but we should explicitly call out that the absence of metadata.version in meta.properties will be used to identify that we are in case B. After 3.2, we will require that metadata.version is present in meta.properties for new clusters. If users omit the --metadata-version flag from kafka-storage.sh, we should fill in a default.

So how about the following changes/clarifications to the KIP:

* Starting with 3.2, metadata.version is required in KRaft, and IBP is ignored
* If meta.properties does not include metadata.version, it indicates this cluster was initialized with a preview release
* If upgrading from a preview release to 3.2+, the controller will initialize metadata.version=1
* metadata.version=1 is backwards compatible with preview releases

-David

On Thu, Nov 18, 2021 at 10:02 AM David Arthur wrote:
> [...]
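The case A/B/C resolution David describes above can be sketched as follows. This is a hypothetical illustration, not the actual Kafka controller code; the function and constant names are invented. The rule it encodes: a committed FeatureLevelRecord wins (case A), a missing metadata.version in meta.properties marks a preview-era cluster that gets bootstrapped to version 1 (case B), and a freshly formatted cluster uses the version written at format time (case C).

```python
# metadata.version declared backwards compatible with 3.0/3.1 preview releases
PREVIEW_COMPAT_VERSION = 1

def resolve_metadata_version(finalized_in_log, meta_properties_version):
    """Decide which metadata.version a 3.2+ controller should finalize.

    finalized_in_log: version from a committed FeatureLevelRecord, or None.
    meta_properties_version: metadata.version from meta.properties, or None.
    """
    if finalized_in_log is not None:
        # Case A: a FeatureLevelRecord was already committed; stay at it.
        return finalized_in_log
    if meta_properties_version is None:
        # Case B: preview cluster upgraded to 3.2+; bootstrap version 1.
        return PREVIEW_COMPAT_VERSION
    # Case C: new cluster formatted with an explicit (or default) version.
    return meta_properties_version
```

Note that the absence of metadata.version in meta.properties is the only signal distinguishing case B from case C, which is why the proposal requires the field for all newly formatted 3.2+ clusters.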
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jason,

1/2. You've got it right. The intention is that metadata.version will gate both RPCs (like IBP does) as well as metadata records. So, when a broker sees that metadata.version changed, it may start advertising new RPCs, and it will need to refresh the ApiVersions it has for other brokers. I can try to make this more explicit in the KIP.

3. Yes, that's the intention of the --dry-run flag. The current implementation in kafka-features.sh does a dry run on the client side, but this KIP pushes the validation down to the controller. This will give us the context needed to do proper validation.

Re: version number complexity -- yes, this has come up in offline discussions. With just one feature flag to manage, it's not so bad, but explicitly managing even a few would be a burden. I like your suggestion of the "--release" flag. That could act as a sort of manifest of versions (i.e., the defaults from that release). We could also support something like "kafka-features upgrade --release latest" to bring everything to the highest supported version.

On Wed, Nov 17, 2021 at 9:36 PM Jason Gustafson wrote:
> [...]
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi Colin/David,

> Like David said, basically it boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix. But using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion and we need to have it on a case-by-case basis.

I think most users are not going to care to manage versions for a bunch of different features. The IBP today has many shortcomings, but at least it's tied to a version that users understand (i.e., the release version). How would users know, after upgrading to Kafka 3.1, for example, that they need to upgrade the metadata.version to 3 and offsets.version to 4 (or whatever)? It's a lot of overhead trying to understand all of the potential features and what each upgrade actually means to them. I am wondering if we could give them something more convenient which is tied to the release version. For example, maybe we could use a command like `kafka-features upgrade --release 3.1`, which the broker would then translate to an upgrade to the latest versions of the individual features at the time of the 3.1 release. Basically it's just a static map from release version to feature versions to make the upgrade process more convenient.

Thanks,
Jason

On Wed, Nov 17, 2021 at 6:20 PM Jason Gustafson wrote:
> [...]
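Jason's "static map from release version to feature versions" could be sketched like this. The release names and feature levels below are invented for illustration; they are not the actual mappings, which would be fixed at release time.

```python
# Hypothetical manifest: for each release, the default level of each feature
# shipped with it. Values here are made up for illustration only.
RELEASE_MANIFEST = {
    "3.1": {"metadata.version": 3, "offsets.version": 4},
    "3.2": {"metadata.version": 5, "offsets.version": 4},
}

def feature_targets(release):
    """Translate `--release X` into per-feature target versions.

    `--release latest` resolves to the newest release in the manifest.
    """
    if release == "latest":
        # Lexicographic max is fine for single-digit 3.x names used here.
        release = max(RELEASE_MANIFEST)
    return RELEASE_MANIFEST[release]
```

With such a table, `kafka-features upgrade --release 3.1` needs no per-feature knowledge from the operator: the tool (or controller) expands it into one upgrade request per feature.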
Re: [DISCUSS] KIP-778 KRaft Upgrades
A few additional questions:

1. Currently the IBP tells us what version of individual inter-broker RPCs will be used. I think the plan in this KIP is to use ApiVersions requests instead to find the highest compatible version (just like clients). Do I have that right?

2. The following wasn't very clear to me:

> Brokers will be able to observe changes to metadata.version by observing the metadata log, and could then submit a new ApiVersionsRequest to the other Kafka nodes.

Is the purpose of submitting new ApiVersions requests to update the features or the RPC versions? Does metadata.version also influence the versions that a broker advertises? It would help to have more detail about this.

3. I imagine users will want to know before performing an upgrade whether downgrading will be safe. Would the --dry-run flag tell them this?

Thanks,
Jason

On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe wrote:
> [...]
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
> Hi David,
>
> Forgive me if this ground has been covered already. Today, we have a few
> other things that we have latched onto the IBP, such as upgrades to the
> format of records in __consumer_offsets. I've been assuming that
> metadata.version is not covering this. Is that right, or is there some other
> plan to take care of cases like this?

I think metadata.version could cover changes to things like __consumer_offsets, if people want it to. Or to put it another way, that is out of scope for this KIP. Like David said, it basically boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix, but using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion, and we need to have it on a case-by-case basis.

I think it's worth calling out that having a 1:1 mapping between IBP versions and metadata.versions will result in some metadata.versions that "don't do anything" (i.e. they do the same thing as the previous metadata.version). For example, if we change StopReplicaRequest again, that will not affect KRaft mode, but it probably would require an IBP bump and hence a metadata.version bump. I think that's OK. It's not that different from updating your IBP and getting support for ZStandard when your deployment doesn't use ZStandard compression.

best,
Colin
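Colin's 1:1 mapping between IBP versions and metadata.versions could be sketched as a simple enum. Everything here is hypothetical and illustrative only (the class, the levels, and the IBP names are not Kafka's actual code); the point is just that some levels are "no-op" bumps for KRaft but still keep the mapping aligned with IBP:

```java
// Hypothetical sketch of the 1:1 IBP <-> metadata.version mapping discussed
// above. Not Kafka's actual code: levels, names, and flags are illustrative.
public class MetadataVersionMapping {
    public enum MetadataVersion {
        V1(1, "3.0-IV0", true),
        V2(2, "3.1-IV0", false), // a "no-op" bump that only mattered for ZK mode
        V3(3, "3.2-IV0", true);

        public final int level;
        public final String ibpName;
        public final boolean changesKRaftBehavior;

        MetadataVersion(int level, String ibpName, boolean changesKRaftBehavior) {
            this.level = level;
            this.ibpName = ibpName;
            this.changesKRaftBehavior = changesKRaftBehavior;
        }

        public static MetadataVersion fromLevel(int level) {
            for (MetadataVersion v : values()) {
                if (v.level == level) return v;
            }
            throw new IllegalArgumentException("Unsupported metadata.version " + level);
        }
    }

    public static void main(String[] args) {
        // A "no-op" level still maps to its own distinct IBP name.
        MetadataVersion v = MetadataVersion.fromLevel(2);
        System.out.println(v.ibpName + " changesKRaftBehavior=" + v.changesKRaftBehavior);
    }
}
```

A controller that reads an unknown level from a FeatureLevelRecord would hit the `IllegalArgumentException` path, matching the "terminate on unsupported metadata.version" behavior described in the KIP discussion.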
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Wed, Nov 17, 2021, at 12:28, David Arthur wrote:
> Colin,
>
> I wasn't intending to suggest that IBP and metadata.version co-exist long
> term in KRaft. I was thinking we would have one final IBP that would signal
> KRaft clusters to start looking at metadata.version instead. This was
> working under the assumption that we should be able to upgrade 3.0
> clusters to a version with metadata.version. I don't think it would be too
> difficult, since we could treat the absence of a feature flag as an implicit
> version 0, which would be compatible with 3.0.

Hi David,

Hmm... do we really have to have "one final IBP" for upgraded 3.0 / 3.1 KRaft clusters? It seems like we could just initialize their metadata.version to 1 when the software is upgraded to 3.2. Admittedly this is a bit of a hack -- in general, we don't want to couple software upgrades and metadata version bumps. But 3.0 and 3.1 were "preview," not intended for production, and this is a pretty minor speed bump for someone testing out pre-production software. This would greatly simplify the system by getting rid of IBP completely in KRaft clusters, substituting metadata.version for it.

> Assuming this lands in 3.2, we would have
>
> Software  IBP  metadata.version  effective version
> 3.0       3.0  -                 0
> 3.2       3.0  -                 0
> 3.2       3.2  -                 0
> 3.2       3.2  1                 1
>
> where metadata.version=0 is compatible with what we have today in KRaft.
>
> If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be
> rather inconvenienced :) I can't say for sure that the extra complexity is
> worth the convenience of allowing upgrades from the preview versions.

So let's drill down to what the inconvenience would be. It would basically be an unexpected (if they didn't read the upgrade docs) IBP bump to whatever the IBP is in 3.2. Assuming that they started on KAFKA_3_0_IV0 and got auto-upgraded to the latest, after the upgrade they would have these additional features:

1. Introduce ListOffsets V7 (KIP-724)
2. Add topic IDs to Fetch requests/responses (KIP-516)

That seems pretty minor (again, noting that this ONLY applies to KRaft 3.0 preview users, NOT to ZK users). I don't think it's worth keeping IBP around just to prevent this inconvenience.

We have a limited amount of time and effort available to manage versioning issues. I don't think it's realistic to have to think about the cross-product of all possible metadata.version and IBP values when making a change. We also have to expect new features to keep going in while the transition is taking place. That's why I think having a 1:1 mapping between new IBP versions and metadata.versions is helpful.

best,
Colin
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jason, thanks for the review!

No, I don't think that has been covered explicitly, and no, metadata.version will not gate things like our offsets format. The intention is that we will introduce new KIP-584 feature flags for those formats once the need arises. For ZK clusters, we'll probably keep using the IBP mechanism and only worry about introducing the feature flag for KRaft clusters. I'll add a short section to the KIP with this clarification.

Thanks!
-David
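David's point above is that a future record-format change would be gated on its own KIP-584 feature flag rather than on metadata.version. A minimal sketch of what that gating could look like, with a hypothetical flag name (`offsets.record.format`) and helper that are not part of Kafka's API:

```java
import java.util.Map;

// Sketch: gate a record-format version on its own KIP-584-style feature flag.
// The flag name and helper are hypothetical, not Kafka's actual API.
public class FeatureGateExample {
    static final String OFFSETS_FORMAT_FLAG = "offsets.record.format"; // hypothetical

    // finalizedFeatures: feature name -> finalized level, in the KIP-584 style.
    // An absent flag means the original format (implicit level 0).
    static int offsetsFormatVersion(Map<String, Short> finalizedFeatures) {
        return finalizedFeatures.getOrDefault(OFFSETS_FORMAT_FLAG, (short) 0);
    }

    public static void main(String[] args) {
        System.out.println(offsetsFormatVersion(Map.of()));                               // prints 0
        System.out.println(offsetsFormatVersion(Map.of(OFFSETS_FORMAT_FLAG, (short) 1))); // prints 1
    }
}
```

This mirrors the trade-off Colin described earlier: a dedicated flag lets operators opt out of a format change independently, at the cost of a larger testing matrix.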
Re: [DISCUSS] KIP-778 KRaft Upgrades
Colin,

I wasn't intending to suggest that IBP and metadata.version co-exist long term in KRaft. I was thinking we would have one final IBP that would signal KRaft clusters to start looking at metadata.version instead. This was working under the assumption that we should be able to upgrade 3.0 clusters to a version with metadata.version. I don't think it would be too difficult, since we could treat the absence of a feature flag as an implicit version 0, which would be compatible with 3.0.

Assuming this lands in 3.2, we would have

Software  IBP  metadata.version  effective version
3.0       3.0  -                 0
3.2       3.0  -                 0
3.2       3.2  -                 0
3.2       3.2  1                 1

where metadata.version=0 is compatible with what we have today in KRaft.

If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be rather inconvenienced :) I can't say for sure that the extra complexity is worth the convenience of allowing upgrades from the preview versions.

We probably need to make a decision on this since it impacts a few things in the KIP. What do folks think?

-David
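The "implicit version 0" rule behind David's table can be sketched in a few lines. The helper below is hypothetical, not Kafka code; `null` models the "-" column (no metadata.version feature flag finalized):

```java
// Minimal sketch of the implicit-version-0 rule from the table above:
// a KRaft cluster with no finalized metadata.version feature flag behaves
// as effective version 0, which keeps upgraded 3.0 clusters compatible.
public class EffectiveVersion {
    // null models the "-" column in the table (no feature flag finalized).
    static int effectiveVersion(Integer finalizedMetadataVersion) {
        return finalizedMetadataVersion == null ? 0 : finalizedMetadataVersion;
    }

    public static void main(String[] args) {
        System.out.println(effectiveVersion(null)); // first three table rows -> 0
        System.out.println(effectiveVersion(1));    // last table row -> 1
    }
}
```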
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David,

Forgive me if this ground has been covered already. Today, we have a few other things that we have latched onto the IBP, such as upgrades to the format of records in __consumer_offsets. I've been assuming that metadata.version is not covering this. Is that right, or is there some other plan to take care of cases like this?

Thanks,
Jason
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, Colin,

Thanks for the reply.

For case b, I am not sure that I understand your suggestion. Does "each subsequent level for metadata.version corresponds to an IBP version" mean that we need to keep IBP forever? Could you describe the upgrade process in this case?

Jun
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote: > Hi, David, Colin, > > Thanks for the reply. > > 16. Discussed with David offline a bit. We have 3 cases. > a. We upgrade from an old version where the metadata.version has already > been finalized. In this case it makes sense to stay with that feature > version after the upgrade. +1 > b. We upgrade from an old version where no metadata.version has been > finalized. In this case, it makes sense to leave metadata.version disabled > since we don't know if all brokers have been upgraded. This is the scenario I was hoping to avoid by saying that ALL KRaft clusters have metadata.version of at least 1, and each subsequent level for metadata.version corresponds to an IBP version. The existing KRaft clusters in 3.0 and earlier are preview (not for production) so I think this change is OK for 3.x (given that it affects only KRaft). Then IBP is irrelevant for KRaft clusters (the config is ignored, possibly with a WARN or ERROR message generated if it is set). > c. We are starting from a brand new cluster and of course no > metadata.version has been finalized. In this case, the KIP says it will > pick the metadata.version in meta.properties. In the common case, people > probably won't set the metadata.version in the meta.properties file > explicitly. So, it will be useful to put a default (stable) version there > when the meta.properties. Hmm. I was assuming that clusters where the admin didn't specify any metadata.version during formatting would get the latest metadata.version. Partly, because this is what we do for IBP today. It would be good to clarify this... > > Also, it would be useful to clarify that if a FeatureLevelRecord exists for > metadata.version, the metadata.version in meta.properties will be ignored. > Yeah, I agree. best, Colin > Thanks, > > Jun > > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe wrote: > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote: >> > Hi, David, >> > >> > Thanks for the reply. 
>> > >> > 16. My first concern is that the KIP picks up meta.version inconsistently >> > during the deployment. If a new cluster is started, we pick up the >> highest >> > version. If we upgrade, we leave the feature version unchanged. >> >> Hi Jun, >> >> Thanks again for taking a look. >> >> The proposed behavior in KIP-778 is consistent with how it works today. >> Upgrading the software is distinct from upgrading the IBP. >> >> I think it is important to keep these two operations ("upgrading >> IBP/metadata version" and "upgrading software version") separate. If they >> are coupled it will create a situation where software upgrades are >> difficult and dangerous. >> >> Consider a situation where you find some bug in your current software, and >> you want to upgrade to new software that fixes the bug. If upgrades and IBP >> bumps are coupled, you can't do this without also bumping the IBP, which is >> usually considered a high-risk change. That means that either you have to >> make a special build that includes only the fix (time-consuming and >> error-prone), live with the bug for longer, or be very conservative about >> ever introducing new IBP/metadata versions. None of those are really good >> choices. >> >> > Intuitively, it seems that independent of how a cluster is deployed, we >> > should always pick the same feature version. >> >> I think it makes sense to draw a distinction between upgrading an existing >> cluster and deploying a new one. What most people want out of upgrades is >> that things should keep working, but with bug fixes. If we change that, it >> just makes people more reluctant to upgrade (which is always a problem...) >> >> > I think we need to think this through in this KIP. My second concern is >> > that as a particular version matures, it's inconvenient for a user to >> manually >> > upgrade every feature version. As long as we have a path to achieve that >> in >> > the future, we don't need to address that in this KIP. 
>> >> If people are managing a large number of Kafka clusters, they will want to >> do some sort of A/B testing with IBP/metadata versions. So if you have 1000 >> Kafka clusters, you roll out the new IBP version to 10 of them and see how >> it goes. If that goes well, you roll it out to more, etc. >> >> So, the automation needs to be at the cluster management layer, not at the >> Kafka layer. Each Kafka cluster doesn't know how well things went in the >> other 999 clusters. Automatically upgrading is a bad thing for the same >> reason Kafka automatically upgrading its own software version would be a >> bad thing -- it could lead to a disruption to a sensitive cluster at the >> wrong time. >> >> For people who are just managing one or two Kafka clusters, >> upgrading feature versions isn't a big burden and can be done manually. >> This is all consistent with how IBP works today. >> >> Also, we already have a command-line option to the feature tool which >> upgrades all features to the latest available, if that is what the administrator desires.
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Colin, Thanks for the reply. 16. Discussed with David offline a bit. We have 3 cases. a. We upgrade from an old version where the metadata.version has already been finalized. In this case it makes sense to stay with that feature version after the upgrade. b. We upgrade from an old version where no metadata.version has been finalized. In this case, it makes sense to leave metadata.version disabled since we don't know if all brokers have been upgraded. c. We are starting from a brand new cluster and of course no metadata.version has been finalized. In this case, the KIP says it will pick the metadata.version in meta.properties. In the common case, people probably won't set the metadata.version in the meta.properties file explicitly. So, it will be useful to put a default (stable) version there when the meta.properties file is generated. Also, it would be useful to clarify that if a FeatureLevelRecord exists for metadata.version, the metadata.version in meta.properties will be ignored. Thanks, Jun On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe wrote: > On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote: > > Hi, David, > > > > Thanks for the reply. > > > > 16. My first concern is that the KIP picks up meta.version inconsistently > > during the deployment. If a new cluster is started, we pick up the > highest > > version. If we upgrade, we leave the feature version unchanged. > > Hi Jun, > > Thanks again for taking a look. > > The proposed behavior in KIP-778 is consistent with how it works today. > Upgrading the software is distinct from upgrading the IBP. > > I think it is important to keep these two operations ("upgrading > IBP/metadata version" and "upgrading software version") separate. If they > are coupled it will create a situation where software upgrades are > difficult and dangerous. > > Consider a situation where you find some bug in your current software, and > you want to upgrade to new software that fixes the bug. 
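Jun's three cases, together with the rule agreed elsewhere in the thread that a committed FeatureLevelRecord overrides meta.properties, suggest a simple resolution order. A minimal sketch in Python (illustrative only, not Kafka's actual implementation; the default-version constant is a hypothetical placeholder):

```python
# Illustrative resolution order for the cluster's metadata.version:
# a FeatureLevelRecord committed to the metadata log always wins;
# otherwise a freshly formatted cluster falls back to the version in
# meta.properties, or a default if the operator did not choose one.

LATEST_STABLE_VERSION = 4  # hypothetical feature level, for illustration

def resolve_metadata_version(committed_record, meta_properties_version):
    """Return the metadata.version the quorum leader should initialize."""
    if committed_record is not None:
        # Case (a): already finalized -- meta.properties is ignored.
        return committed_record
    if meta_properties_version is not None:
        # Case (c): new cluster, operator chose a version at format time.
        return meta_properties_version
    # Case (c), no explicit choice: use a default (stable) version.
    return LATEST_STABLE_VERSION

assert resolve_metadata_version(2, 4) == 2   # committed record wins
assert resolve_metadata_version(None, 3) == 3
assert resolve_metadata_version(None, None) == LATEST_STABLE_VERSION
```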
If upgrades and IBP > bumps are coupled, you can't do this without also bumping the IBP, which is > usually considered a high-risk change. That means that either you have to > make a special build that includes only the fix (time-consuming and > error-prone), live with the bug for longer, or be very conservative about > ever introducing new IBP/metadata versions. None of those are really good > choices. > > > Intuitively, it seems that independent of how a cluster is deployed, we > > should always pick the same feature version. > > I think it makes sense to draw a distinction between upgrading an existing > cluster and deploying a new one. What most people want out of upgrades is > that things should keep working, but with bug fixes. If we change that, it > just makes people more reluctant to upgrade (which is always a problem...) > > > I think we need to think this through in this KIP. My second concern is > > that as a particular version matures, it's inconvenient for a user to > manually > > upgrade every feature version. As long as we have a path to achieve that > in > > the future, we don't need to address that in this KIP. > > If people are managing a large number of Kafka clusters, they will want to > do some sort of A/B testing with IBP/metadata versions. So if you have 1000 > Kafka clusters, you roll out the new IBP version to 10 of them and see how > it goes. If that goes well, you roll it out to more, etc. > > So, the automation needs to be at the cluster management layer, not at the > Kafka layer. Each Kafka cluster doesn't know how well things went in the > other 999 clusters. Automatically upgrading is a bad thing for the same > reason Kafka automatically upgrading its own software version would be a > bad thing -- it could lead to a disruption to a sensitive cluster at the > wrong time. > > For people who are just managing one or two Kafka clusters, > upgrading feature versions isn't a big burden and can be done manually. 
> This is all consistent with how IBP works today. > > Also, we already have a command-line option to the feature tool which > upgrades all features to the latest available, if that is what the > administrator desires. Perhaps we could add documentation saying that this > should be done as the last step of the upgrade. > > best, > Colin > > > > > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird > > since the logic is always there. Would it be better to use disable? > > > > Jun > > > > On Fri, Nov 5, 2021 at 8:11 AM David Arthur > > wrote: > > > >> Colin and Jun, thanks for the additional comments! > >> > >> Colin: > >> > >> > We've been talking about having an automated RPC compatibility checker > >> > >> Do we have a way to mark fields in schemas as deprecated? It can stay in > >> the RPC, it just complicates the logic a bit. > >> > >> > It would be nice if the active controller could validate that a > majority > >> of the quorum could use the proposed metadata.version. The active > >> controller should have this information, right? If we don't have recent > >> information from a quorum of voters, we wouldn't be active.
Re: [DISCUSS] KIP-778 KRaft Upgrades
Yeah I agree that checking a majority of voters support the metadata.version is sufficient. What I was originally considering is whether (in the future) we could consider encoding the metadata.version value in the vote request as well, so that the elected leader is supposed to have a version that's supported by a majority of the quorum. Guozhang On Tue, Nov 16, 2021 at 2:02 PM Colin McCabe wrote: > On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote: > > Hi Colin, > > > > If we allow downgrades which would be appended in metadata.version, then > > the length of the __cluster_metadata log may not be safe to indicate higher > > versions, since older version records could still be appended later than > a > > new version record right? > > > > That's fair... the longer log could be at a lower metadata.version. > > However, can you think of a scenario where the upgrade / downgrade logic > doesn't protect us here? > > Checking that a majority of the voters support the new metadata.version > we're going to seems to cover all the cases I can think of. I guess there > could be time-of-check, time-of-use race conditions, but they only happen > if you are doing a feature downgrade at the same time as a software version > downgrade. This is something the cluster management software should not do > (I guess we should spell this out in the KIP.) I think nobody would want to > do that since it would mean rolling the cluster while you're messing with > feature flags. > > best, > Colin > -- Guozhang
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote: > Hi Colin, > > If we allow downgrades which would be appended in metadata.version, then > the length of the __cluster_metadata log may not be safe to indicate higher > versions, since older version records could still be appended later than a > new version record right? > That's fair... the longer log could be at a lower metadata.version. However, can you think of a scenario where the upgrade / downgrade logic doesn't protect us here? Checking that a majority of the voters support the new metadata.version we're going to seems to cover all the cases I can think of. I guess there could be time-of-check, time-of-use race conditions, but they only happen if you are doing a feature downgrade at the same time as a software version downgrade. This is something the cluster management software should not do (I guess we should spell this out in the KIP.) I think nobody would want to do that since it would mean rolling the cluster while you're messing with feature flags. best, Colin
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi Colin, If we allow downgrades which would be appended in metadata.version, then the length of the __cluster_metadata log may not be safe to indicate higher versions, since older version records could still be appended later than a new version record right? On Tue, Nov 16, 2021 at 1:16 PM Colin McCabe wrote: > On Tue, Nov 16, 2021, at 06:36, David Arthur wrote: > > An interesting case here is how to handle a version update if a majority > of > > the quorum supports it, but the leader doesn't. For example, if C1 was > the > > leader and an upgrade to version 4 was requested. Maybe this would > trigger > > C1 to resign and inform the client to retry the update later. > > > > Hmm, wouldn't we want to just reject the version update in this case? I > don't see what the advantage of allowing it would be. > > The reason for requiring a majority rather than all voters is mainly to > cover the case where a voter is down, I thought. That clearly doesn't apply > if the un-upgraded voter is the leader itself... > > > > > We may eventually want to consider the metadata.version when electing a > > leader, but as long as we have the majority requirement before > committing a > > new metadata.version, I think we should be safe. > > > > Yeah, this is safe. If we elect a leader at metadata.version X then that > means that a majority of the cluster is at least at version X. Proof by > contradiction: assume that this is not the case. Then the newly elected > leader must have a shorter __cluster_metadata log than a majority of the > voters. But this is incompatible with winning a Raft election. > > In the case where the leader is "behind" some of the other voters, those > voters will truncate their logs to match the new leader. This will > downgrade them. Basically this is the case where the feature upgrade was > proposed, but never fully completed. > > best, > Colin > > > > -David > > > > On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang > wrote: > > > >> Thanks David, > >> > >> 1. 
Got it. One thing I'm still not very clear is why it's sufficient to > >> select a metadata.version which is supported by majority of the quorum, > but > >> not the whole quorum (i.e. choosing the lowest version among all the > quorum > >> members)? Since the leader election today does not take this value into > >> consideration, we are not guaranteed that newly selected leaders would > >> always be able to recognize and support the initialized metadata.version > >> right? > >> > >> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is > beyond > >> the scope of this KIP, we can defer it to later discussions. > >> > >> On Mon, Nov 15, 2021 at 8:13 AM David Arthur > >> wrote: > >> > >> > Guozhang, thanks for the review! > >> > > >> > 1, As we've defined it so far, the initial metadata.version is set by > an > >> > operator via the "kafka-storage.sh" tool. It would be possible for > >> > different values to be selected, but only the quorum leader would > commit > >> a > >> > FeatureLevelRecord with the version they read locally. See the above > >> reply > >> > to Jun's question for a little more detail. > >> > > >> > We need to enable the KRaft RPCs regardless of metadata.version (vote, > >> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can > be > >> > elected, and followers can learn about the selected metadata.version. > I > >> > believe the quorum startup procedure would go something like: > >> > > >> > * Controller nodes start, read their logs, begin leader election > >> > * While the elected node is becoming leader (in > >> > QuorumMetaLogListener#handleLeaderChange), initialize > metadata.version if > >> > necessary > >> > * Followers replicate the FeatureLevelRecord > >> > > >> > This (probably) means the quorum peers must continue to rely on > >> ApiVersions > >> > to negotiate compatible RPC versions and these versions cannot depend > on > >> > metadata.version. > >> > > >> > Does this make sense? 
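The quorum startup procedure David outlines above can be sketched as follows. This is a hypothetical sketch whose names mirror the discussion (e.g. handle_leader_change standing in for QuorumMetaLogListener#handleLeaderChange), not actual Kafka code:

```python
# Illustrative model of the startup sequence:
#  1. controllers start, read their logs, and elect a leader
#  2. on becoming leader, initialize metadata.version if no
#     FeatureLevelRecord has been committed yet
#  3. followers replicate the log and learn the version from it

class QuorumLeader:
    def __init__(self, log, bootstrap_version):
        self.log = log                          # the replicated metadata log
        self.bootstrap_version = bootstrap_version  # from meta.properties

    def handle_leader_change(self, is_leader):
        already_initialized = any(
            record_type == "FeatureLevelRecord" for record_type, _ in self.log
        )
        if is_leader and not already_initialized:
            self.log.append(("FeatureLevelRecord", self.bootstrap_version))

log = []
leader = QuorumLeader(log, bootstrap_version=1)
leader.handle_leader_change(is_leader=True)
leader.handle_leader_change(is_leader=True)  # idempotent: no duplicate record
assert log == [("FeatureLevelRecord", 1)]
```

Note that steps 1 and 2 must work before any metadata.version is known, which is why the quorum RPCs themselves cannot be gated on it.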
> >> > > >> > > >> > 2, ApiVersionResponse includes the broker's supported feature flags as > >> well > >> > as the cluster-wide finalized feature flags. We probably need to add > >> > something like the feature flag "epoch" to this response payload in > order > >> > to see which broker is most up to date. > >> > > >> > If the new feature flag version included RPC changes, we are helped by > >> the > >> > fact that a client won't attempt to use the new RPC until it has > >> discovered > >> > a broker that supports it via ApiVersions. The problem is more > difficult > >> > for cases like you described where the feature flag upgrade changes > the > >> > behavior of the broker, but not its RPCs. This is actually the same > >> problem > >> > as upgrading the IBP. During a rolling restart, clients may hit > different > >> > brokers with different capabilities and not know it. > >> > > >> > This probably warrants further investigation, but hopefully you agree > it > >> is > >> > beyond the scope of this KIP :) > >>
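The ApiVersions negotiation the quorum peers rely on amounts to intersecting supported version ranges per RPC and using the highest common version. A minimal sketch (illustrative only, not Kafka's client code):

```python
# Each side advertises a [min, max] supported version range for an RPC.
# The caller picks the highest version both sides support, or gives up
# if the ranges do not overlap.

def negotiate(client_range, broker_range):
    lo = max(client_range[0], broker_range[0])
    hi = min(client_range[1], broker_range[1])
    if lo > hi:
        return None  # no common version; this RPC cannot be used
    return hi

assert negotiate((0, 7), (3, 9)) == 7    # use the newest shared version
assert negotiate((0, 2), (5, 9)) is None # incompatible peers
```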
Re: [DISCUSS] KIP-778 KRaft Upgrades
Thanks, I think having the leader election consider the metadata.version would be a good idea moving forward, but we do not need to include it in this KIP. On Tue, Nov 16, 2021 at 6:37 AM David Arthur wrote: > Guozhang, > > 1. By requiring a majority of all controller nodes to support the version > selected by the leader, we increase the likelihood that the next leader > will also support it. We can't guarantee that all nodes definitely support > the selected metadata.version because there could always be an offline or > partitioned peer that is running old software. > > If a controller running old software manages to get elected, we hit this case: > > In the unlikely event that an active controller encounters an unsupported > > metadata.version, it should resign and terminate. > > > So, given this, we should be able to eventually elect a controller that > does support the metadata.version. > > > Consider controllers C1, C2, C3 with this arrangement:
>
> Node  SoftwareVer  MaxMetadataVer
> C1    3.2          1
> C2    3.3          4
> C3    3.3          4
>
> If the current metadata.version is 1 and we're trying to upgrade to 4, we > would allow it since two of the three nodes support it. If any one > controller is down while we are attempting an upgrade, we would require > that both of the remaining alive nodes support the target metadata.version > (since we require a majority of _all_ controller nodes, not just alive > ones). > > An interesting case here is how to handle a version update if a majority of > the quorum supports it, but the leader doesn't. For example, if C1 was the > leader and an upgrade to version 4 was requested. Maybe this would trigger > C1 to resign and inform the client to retry the update later. > > We may eventually want to consider the metadata.version when electing a > leader, but as long as we have the majority requirement before committing a > new metadata.version, I think we should be safe. 
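The majority rule in David's example can be sketched as a small check. This is illustrative only, not the controller's actual code; the key point is that the denominator is the full set of registered controller nodes, not just the currently alive ones:

```python
# Decide whether a target metadata.version may be finalized: a strict
# majority of ALL controller nodes must support it, so an offline node
# counts against the upgrade rather than being ignored.

def can_finalize(target_version, max_supported_by_node, total_voters):
    supporters = sum(
        1 for v in max_supported_by_node.values() if v >= target_version
    )
    return supporters > total_voters // 2

# The C1/C2/C3 example: C1 caps out at version 1, C2 and C3 at version 4.
max_versions = {"C1": 1, "C2": 4, "C3": 4}
assert can_finalize(4, max_versions, total_voters=3)  # 2 of 3 support it

# If C3 is down, only C2's support is visible: 1 of 3 is not a majority.
assert not can_finalize(4, {"C1": 1, "C2": 4}, total_voters=3)
```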
> > -David > > On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang wrote: > > > Thanks David, > > > > 1. Got it. One thing I'm still not very clear is why it's sufficient to > > select a metadata.version which is supported by majority of the quorum, > but > > not the whole quorum (i.e. choosing the lowest version among all the > quorum > > members)? Since the leader election today does not take this value into > > consideration, we are not guaranteed that newly selected leaders would > > always be able to recognize and support the initialized metadata.version > > right? > > > > 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is > beyond > > the scope of this KIP, we can defer it to later discussions. > > > > On Mon, Nov 15, 2021 at 8:13 AM David Arthur > > wrote: > > > > > Guozhang, thanks for the review! > > > > > > 1, As we've defined it so far, the initial metadata.version is set by > an > > > operator via the "kafka-storage.sh" tool. It would be possible for > > > different values to be selected, but only the quorum leader would > commit > > a > > > FeatureLevelRecord with the version they read locally. See the above > > reply > > > to Jun's question for a little more detail. > > > > > > We need to enable the KRaft RPCs regardless of metadata.version (vote, > > > heartbeat, fetch, etc) so that the quorum can be formed, a leader can > be > > > elected, and followers can learn about the selected metadata.version. I > > > believe the quorum startup procedure would go something like: > > > > > > * Controller nodes start, read their logs, begin leader election > > > * While the elected node is becoming leader (in > > > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version > if > > > necessary > > > * Followers replicate the FeatureLevelRecord > > > > > > This (probably) means the quorum peers must continue to rely on > > ApiVersions > > > to negotiate compatible RPC versions and these versions cannot depend > on > > > metadata.version. 
> > > > > > Does this make sense? > > > > > > > > > 2, ApiVersionResponse includes the broker's supported feature flags as > > well > > > as the cluster-wide finalized feature flags. We probably need to add > > > something like the feature flag "epoch" to this response payload in > order > > > to see which broker is most up to date. > > > > > > If the new feature flag version included RPC changes, we are helped by > > the > > > fact that a client won't attempt to use the new RPC until it has > > discovered > > > a broker that supports it via ApiVersions. The problem is more > difficult > > > for cases like you described where the feature flag upgrade changes the > > > behavior of the broker, but not its RPCs. This is actually the same > > problem > > > as upgrading the IBP. During a rolling restart, clients may hit > different > > > brokers with different capabilities and not know it. > > > > > > This probably warrants further investigation, but hopefully you agree > it > > is > > > beyond the scope of this KIP :) > > > > > > -David > > >
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Nov 16, 2021, at 06:36, David Arthur wrote: > An interesting case here is how to handle a version update if a majority of > the quorum supports it, but the leader doesn't. For example, if C1 was the > leader and an upgrade to version 4 was requested. Maybe this would trigger > C1 to resign and inform the client to retry the update later. > Hmm, wouldn't we want to just reject the version update in this case? I don't see what the advantage of allowing it would be. The reason for requiring a majority rather than all voters is mainly to cover the case where a voter is down, I thought. That clearly doesn't apply if the un-upgraded voter is the leader itself... > > We may eventually want to consider the metadata.version when electing a > leader, but as long as we have the majority requirement before committing a > new metadata.version, I think we should be safe. > Yeah, this is safe. If we elect a leader at metadata.version X then that means that a majority of the cluster is at least at version X. Proof by contradiction: assume that this is not the case. Then the newly elected leader must have a shorter __cluster_metadata log than a majority of the voters. But this is incompatible with winning a Raft election. In the case where the leader is "behind" some of the other voters, those voters will truncate their logs to match the new leader. This will downgrade them. Basically this is the case where the feature upgrade was proposed, but never fully completed. best, Colin > -David > > On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang wrote: > >> Thanks David, >> >> 1. Got it. One thing I'm still not very clear is why it's sufficient to >> select a metadata.version which is supported by majority of the quorum, but >> not the whole quorum (i.e. choosing the lowest version among all the quorum >> members)? 
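Colin's log-length argument can be illustrated with a toy model of Raft's vote-granting rule. This is deliberately simplified to log length alone (real Raft compares the last entry's term and index), purely to show why a leader with a shorter log than a majority cannot win:

```python
# Toy election model: a voter grants its vote only when the candidate's
# log is at least as long as its own, and the candidate needs a strict
# majority of the full voter set (including its own self-vote).

def wins_election(candidate_log_len, voter_log_lens):
    votes = 1 + sum(1 for l in voter_log_lens if l <= candidate_log_len)
    total_voters = len(voter_log_lens) + 1
    return votes > total_voters // 2

# A candidate whose log is shorter than a majority's cannot win, so it
# cannot be missing a FeatureLevelRecord that a majority has committed:
assert not wins_election(candidate_log_len=5, voter_log_lens=[8, 8])

# One voter behind, one ahead: the behind voter grants, majority reached.
assert wins_election(candidate_log_len=5, voter_log_lens=[3, 8])
```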
Since the leader election today does not take this value into >> consideration, we are not guaranteed that newly selected leaders would >> always be able to recognize and support the initialized metadata.version >> right? >> >> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond >> the scope of this KIP, we can defer it to later discussions. >> >> On Mon, Nov 15, 2021 at 8:13 AM David Arthur >> wrote: >> >> > Guozhang, thanks for the review! >> > >> > 1, As we've defined it so far, the initial metadata.version is set by an >> > operator via the "kafka-storage.sh" tool. It would be possible for >> > different values to be selected, but only the quorum leader would commit >> a >> > FeatureLevelRecord with the version they read locally. See the above >> reply >> > to Jun's question for a little more detail. >> > >> > We need to enable the KRaft RPCs regardless of metadata.version (vote, >> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be >> > elected, and followers can learn about the selected metadata.version. I >> > believe the quorum startup procedure would go something like: >> > >> > * Controller nodes start, read their logs, begin leader election >> > * While the elected node is becoming leader (in >> > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if >> > necessary >> > * Followers replicate the FeatureLevelRecord >> > >> > This (probably) means the quorum peers must continue to rely on >> ApiVersions >> > to negotiate compatible RPC versions and these versions cannot depend on >> > metadata.version. >> > >> > Does this make sense? >> > >> > >> > 2, ApiVersionResponse includes the broker's supported feature flags as >> well >> > as the cluster-wide finalized feature flags. We probably need to add >> > something like the feature flag "epoch" to this response payload in order >> > to see which broker is most up to date. 
>> > >> > If the new feature flag version included RPC changes, we are helped by >> the >> > fact that a client won't attempt to use the new RPC until it has >> discovered >> > a broker that supports it via ApiVersions. The problem is more difficult >> > for cases like you described where the feature flag upgrade changes the >> > behavior of the broker, but not its RPCs. This is actually the same >> problem >> > as upgrading the IBP. During a rolling restart, clients may hit different >> > brokers with different capabilities and not know it. >> > >> > This probably warrants further investigation, but hopefully you agree it >> is >> > beyond the scope of this KIP :) >> > >> > -David >> > >> > >> > On Mon, Nov 15, 2021 at 10:26 AM David Arthur >> > wrote: >> > >> > > Jun, thanks for the comments! >> > > >> > > 16, When a new cluster is deployed, we don't select the highest >> available >> > > metadata.version, but rather the quorum leader picks a bootstrap >> version >> > > defined in meta.properties. As mentioned earlier, we should add >> > validation >> > > here to ensure a majority of the followers could support this version >> > > before initializing it. This would avoid a situation where a failover >> > > results in a new leader who can't support the selected metadata.version.
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David, On Mon, Nov 15, 2021, at 08:13, David Arthur wrote: > This (probably) means the quorum peers must continue to rely on ApiVersions > to negotiate compatible RPC versions and these versions cannot depend on > metadata.version. I agree. Can we explicitly spell out in KIP that quorum peers use ApiVersions to negotiate compatible RPC versions, rather than using metadata.version? > > If the new feature flag version included RPC changes, we are helped by the > fact that a client won't attempt to use the new RPC until it has discovered > a broker that supports it via ApiVersions. The problem is more difficult > for cases like you described where the feature flag upgrade changes the > behavior of the broker, but not its RPCs. This is actually the same problem > as upgrading the IBP. During a rolling restart, clients may hit different > brokers with different capabilities and not know it. > > This probably warrants further investigation, but hopefully you agree it is > beyond the scope of this KIP :) > Yes, this is an existing issue (at least in theory... doesn't seem to be in practice) and out of scope for this KIP. > Thinking a bit more on this, we do need to define a state where we're > running newer software, but we don't have the feature flag set. This could > happen if we were running an older IBP that did not support KIP-778. I took another look at the KIP and I see that it calls for IBP to continue to exist alongside metadata.version. In our previous offline discussions, I think we were assuming that metadata.version would replace the IBP completely. Also, there would be a 1:1 correspondence between new IBP versions and new metadata.version levels. Basically ZK-based clusters will keep using IBP, the same as they've always done, whereas KRaft-based clusters will use metadata.version instead. I actually don't think it makes sense to keep IBP if we are introducing metadata.version. 
I think metadata.version should replace it completely in the KRaft case, and ZK-based brokers should keep working exactly like they currently do. This avoids all the complexity described above. What do you think? Also, it seems like KIP-778 is not present on the main KIP wiki page... can you add it? best, Colin > -David > > > On Mon, Nov 15, 2021 at 10:26 AM David Arthur > wrote: > >> Jun, thanks for the comments! >> >> 16, When a new cluster is deployed, we don't select the highest available >> metadata.version, but rather the quorum leader picks a bootstrap version >> defined in meta.properties. As mentioned earlier, we should add validation >> here to ensure a majority of the followers could support this version >> before initializing it. This would avoid a situation where a failover >> results in a new leader who can't support the selected metadata.version. >> >> Thinking a bit more on this, we do need to define a state where we're >> running newer software, but we don't have the feature flag set. This could >> happen if we were running an older IBP that did not support KIP-778. >> Following on this, it doesn't seem too difficult to consider a case where >> the IBP has been upgraded, but we still have not finalized a >> metadata.version. Here are some possible version combinations (assuming >> KIP-778 is added to Kafka 3.2):
>>
>> Case  Software  IBP  metadata.version  effective version
>> -------------------------------------------------------
>> A     3.1       3.1  -                 0  software too old for feature flag
>> B     3.2       3.1  -                 0  feature flag supported, but IBP too old
>> C     3.2       3.2  -                 0  feature flag supported, but not initialized
>> D     3.2       3.2  1                 1  feature flag initialized to 1 (via operator or bootstrap process)
>> ...
>> E     3.8       3.1  -                 0  ...IBP too old
>> F     3.8       3.2  -                 0  ...not initialized
>> G     3.8       3.2  4                 4
>>
>> Here I'm defining version 0 as "no metadata.version set". So back to your >> comment, I think the KIP omits discussion of case C from the above table >> which I can amend. 
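The "effective version" column in David's table follows a simple rule, sketched here as illustrative Python. The 3.2 thresholds mirror the table's stated assumption that KIP-778 lands in Kafka 3.2; versions are modeled as (major, minor) tuples:

```python
# Effective metadata.version as a function of software version, IBP,
# and the finalized feature level; 0 means "no metadata.version set".

def effective_version(software, ibp, finalized):
    if software < (3, 2):      # case A: software too old for the feature flag
        return 0
    if ibp < (3, 2):           # cases B, E: feature flag supported, IBP too old
        return 0
    if finalized is None:      # cases C, F: flag supported, not initialized
        return 0
    return finalized           # cases D, G: flag initialized

assert effective_version((3, 1), (3, 1), None) == 0  # case A
assert effective_version((3, 2), (3, 1), None) == 0  # case B
assert effective_version((3, 2), (3, 2), None) == 0  # case C
assert effective_version((3, 2), (3, 2), 1) == 1     # case D
assert effective_version((3, 8), (3, 2), 4) == 4     # case G
```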
Does that cover your concerns, or am I missing something >> else? >> >> >> > it's inconvenient for a user to manually upgrade every feature version >> >> For this, we would probably want to extend the capabilities of KIP-584. I >> don't think anything we've discussed for KIP-778 will preclude us from >> adding some kind of auto-upgrade in the future. >> >> 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit >> weird. >> >> >> >> On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang wrote: >> >>> Hello David, >>> >>> Thanks for the very nice writeup! It helped me a lot to refresh my memory >>> on KIP-630/590/584 :) >>> >>> I just had two clarification questions after reading through the KIP: >>> >>> 1. For the initialization procedure, do we guarantee that all the quorum
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Fri, Nov 5, 2021, at 08:10, David Arthur wrote: > Colin and Jun, thanks for the additional comments! > > Colin: > >> We've been talking about having an automated RPC compatibility checker > > Do we have a way to mark fields in schemas as deprecated? It can stay in > the RPC, it just complicates the logic a bit. > Hi David, The easiest thing to do is probably just to add a comment to the "description" field. Thinking about it more, I really think we should stick to the normal RPC versioning rules. Making exceptions just tends to lead to issues down the road. >> It would be nice if the active controller could validate that a majority > of the quorum could use the proposed metadata.version. The active > controller should have this information, right? If we don't have recent > information from a quorum of voters, we wouldn't be active. > > I believe we should have this information from the ApiVersionsResponse. It > would be good to do this validation to avoid a situation where a > quorum leader can't be elected due to unprocessable records. > Sounds good... can you add this to the KIP? >> Do we need delete as a command separate from downgrade? > > I think from an operator's perspective, it is nice to distinguish between > changing a feature flag and unsetting it. It might be surprising to an > operator to see the flag's version set to nothing when they requested the > downgrade to version 0 (or less). > Fair enough. If we want to go this route we should also specify whether set to 0 is legal, or whether it's necessary to use "delete". Jun's proposal of using "disable" instead of delete also seems reasonable here... > > > it seems like we should spell out that metadata.version begins at 1 in > >KRaft clusters > > I added this text: > Thanks. 
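The distinction under discussion (downgrading a feature to a lower finalized level versus unsetting it entirely with "delete"/"disable") could be modeled like this. These are hypothetical semantics for illustration, not the actual kafka-features.sh behavior:

```python
# "downgrade" moves a finalized feature to a lower level that is still
# at least 1; "disable" removes the finalized level entirely, so the
# operator never sees a confusing "downgrade to 0" result.

def apply_update(finalized, op, target=None):
    if op == "disable":
        return None  # feature is no longer finalized at any level
    if op == "downgrade":
        if target is None or target < 1:
            raise ValueError("use 'disable' to unset a feature")
        return target
    raise ValueError("unknown op: " + op)

assert apply_update(4, "downgrade", 2) == 2
assert apply_update(4, "disable") is None
```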
> > We probably also want an RPC implemented by both brokers and controllers > > that will reveal the min and max supported versions for each feature level > > supported by the server > > This is available in ApiVersionsResponse (we include the server's supported > features as well as the cluster's finalized features) Good point. > > Jun: > > 20. Indeed snapshots are not strictly necessary during an upgrade, I've > reworded this > I thought we agreed that we would do a snapshot before each upgrade so that if there was a big problem with the new metadata version, we would at least have a clean snapshot to fall back on? best, Colin > > Thanks! > David > > > On Thu, Nov 4, 2021 at 6:51 PM Jun Rao wrote: > >> Hi, David, Jose and Colin, >> >> Thanks for the reply. A few more comments. >> >> 12. It seems that we haven't updated the AdminClient accordingly? >> >> 14. "Metadata snapshot is generated and sent to the other inactive >> controllers and to brokers". I thought we wanted each broker to generate >> its own snapshot independently? If only the controller generates the >> snapshot, how do we force other brokers to pick it up? >> >> 16. If a feature version is new, one may not want to enable it immediately >> after the cluster is upgraded. However, if a feature version has been >> stable, requiring every user to run a command to upgrade to that version >> seems inconvenient. One way to improve this is for each feature to define >> one version as the default. Then, when we upgrade a cluster, we will >> automatically upgrade the feature to the default version. An admin could >> use the tool to upgrade to a version higher than the default. >> >> 20. "The quorum controller can assist with this process by generating a >> metadata snapshot after a metadata.version increase has been committed to >> the metadata log. This snapshot will be a convenient way to let broker and >> controller components rebuild their entire in-memory state following an >> upgrade." 
The new version of the software could read both the new and the >> old version. Is generating a new snapshot during upgrade needed? >> >> Jun >> >> >> On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe wrote: >> >> > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: >> > > Hi, David, >> > > >> > > One more comment. >> > > >> > > 16. The main reason why KIP-584 requires finalizing a feature manually >> is >> > > that in the ZK world, the controller doesn't know all brokers in a >> > cluster. >> > > A broker temporarily down is not registered in ZK. in the KRaft world, >> > the >> > > controller keeps track of all brokers, including those that are >> > temporarily >> > > down. This makes it possible for the controller to automatically >> > finalize a >> > > feature---it's safe to do so when all brokers support that feature. >> This >> > > will make the upgrade process much simpler since no manual command is >> > > required to turn on a new feature. Have we considered this? >> > > >> > > Thanks, >> > > >> > > Jun >> > >> > Hi Jun, >> > >> > I guess David commented on this point already, but I'll comment as well. >> I >> > always had the perception that users viewed rolls as potentially risky >> and >> > were looki
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote: > Hi, David, > > Thanks for the reply. > > 16. My first concern is that the KIP picks up meta.version inconsistently > during the deployment. If a new cluster is started, we pick up the highest > version. If we upgrade, we leave the feature version unchanged. Hi Jun, Thanks again for taking a look. The proposed behavior in KIP-778 is consistent with how it works today. Upgrading the software is distinct from upgrading the IBP. I think it is important to keep these two operations ("upgrading IBP/metadata version" and "upgrading software version") separate. If they are coupled it will create a situation where software upgrades are difficult and dangerous. Consider a situation where you find some bug in your current software, and you want to upgrade to new software that fixes the bug. If upgrades and IBP bumps are coupled, you can't do this without also bumping the IBP, which is usually considered a high-risk change. That means that either you have to make a special build that includes only the fix (time-consuming and error-prone), live with the bug for longer, or be very conservative about ever introducing new IBP/metadata versions. None of those are really good choices. > Intuitively, it seems that independent of how a cluster is deployed, we > should always pick the same feature version. I think it makes sense to draw a distinction between upgrading an existing cluster and deploying a new one. What most people want out of upgrades is that things should keep working, but with bug fixes. If we change that, it just makes people more reluctant to upgrade (which is always a problem...) > I think we need to think this through in this KIP. My second concern is > that as a particular version matures, it's inconvenient for a user to manually > upgrade every feature version. As long as we have a path to achieve that in > the future, we don't need to address that in this KIP. 
If people are managing a large number of Kafka clusters, they will want to do some sort of A/B testing with IBP/metadata versions. So if you have 1000 Kafka clusters, you roll out the new IBP version to 10 of them and see how it goes. If that goes well, you roll it out to more, etc. So, the automation needs to be at the cluster management layer, not at the Kafka layer. Each Kafka cluster doesn't know how well things went in the other 999 clusters. Automatically upgrading is a bad thing for the same reason Kafka automatically upgrading its own software version would be a bad thing -- it could lead to a disruption to a sensitive cluster at the wrong time. For people who are just managing one or two Kafka clusters, automatically upgrading feature versions isn't a big burden and can be done manually. This is all consistent with how IBP works today. Also, we already have a command-line option to the feature tool which upgrades all features to the latest available, if that is what the administrator desires. Perhaps we could add documentation saying that this should be done as the last step of the upgrade. best, Colin > > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird > since the logic is always there. Would it be better to use disable? > > Jun > > On Fri, Nov 5, 2021 at 8:11 AM David Arthur > wrote: > >> Colin and Jun, thanks for the additional comments! >> >> Colin: >> >> > We've been talking about having an automated RPC compatibility checker >> >> Do we have a way to mark fields in schemas as deprecated? It can stay in >> the RPC, it just complicates the logic a bit. >> >> > It would be nice if the active controller could validate that a majority >> of the quorum could use the proposed metadata.version. The active >> controller should have this information, right? If we don't have recent >> information from a quorum of voters, we wouldn't be active. >> >> I believe we should have this information from the ApiVersionsResponse. 
It >> would be good to do this validation to avoid a situation where a >> quorum leader can't be elected due to unprocessable records. >> >> > Do we need delete as a command separate from downgrade? >> >> I think from an operator's perspective, it is nice to distinguish between >> changing a feature flag and unsetting it. It might be surprising to an >> operator to see the flag's version set to nothing when they requested the >> downgrade to version 0 (or less). >> >> > it seems like we should spell out that metadata.version begins at 1 in >> KRaft clusters >> >> I added this text: >> >> Introduce an IBP version to indicate the lowest software version that >> > supports *metadata.version*. Below this IBP, the *metadata.version* is >> > undefined and will not be examined. At or above this IBP, the >> > *metadata.version* must be *0* for ZooKeeper clusters and will be >> > initialized as *1* for KRaft clusters. >> >> >> > We probably also want an RPC implemented by both brokers and controllers >> that will reveal the min
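To make the point above concrete, here is a rough illustration (the field names are mine, not the actual ApiVersionsResponse schema) of the check this exchange enables: each server advertises a supported [min, max] range per feature plus the cluster's finalized levels, so a peer can verify that every finalized level falls inside the range the server supports.

```python
def is_compatible(supported, finalized):
    # supported: feature name -> (min, max) range this server advertises
    # finalized: feature name -> level finalized for the whole cluster
    return all(
        feature in supported
        and supported[feature][0] <= level <= supported[feature][1]
        for feature, level in finalized.items()
    )

server = {"metadata.version": (1, 4)}
print(is_compatible(server, {"metadata.version": 4}))  # True
print(is_compatible(server, {"metadata.version": 5}))  # False: above the server's max
```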
Re: [DISCUSS] KIP-778 KRaft Upgrades
Guozhang,

1. By requiring a majority of all controller nodes to support the version selected by the leader, we increase the likelihood that the next leader will also support it. We can't guarantee that all nodes definitely support the selected metadata.version because there could always be an offline or partitioned peer that is running old software. If a controller running old software manages to get elected, we hit this case:

> In the unlikely event that an active controller encounters an unsupported metadata.version, it should resign and terminate.

So, given this, we should be able to eventually elect a controller that does support the metadata.version.

Consider controllers C1, C2, C3 with this arrangement:

Node  SoftwareVer  MaxMetadataVer
C1    3.2          1
C2    3.3          4
C3    3.3          4

If the current metadata.version is 1 and we're trying to upgrade to 4, we would allow it since two of the three nodes support it. If any one controller is down while we are attempting an upgrade, we would require that both of the remaining alive nodes support the target metadata.version (since we require a majority of _all_ controller nodes, not just alive ones).

An interesting case here is how to handle a version update if a majority of the quorum supports it, but the leader doesn't. For example, if C1 were the leader and an upgrade to version 4 was requested, maybe this would trigger C1 to resign and inform the client to retry the update later.

We may eventually want to consider the metadata.version when electing a leader, but as long as we have the majority requirement before committing a new metadata.version, I think we should be safe.

-David

On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang wrote:

> Thanks David,
>
> 1. Got it. One thing I'm still not very clear on is why it's sufficient to select a metadata.version which is supported by a majority of the quorum, but not the whole quorum (i.e. choosing the lowest version among all the quorum members)?
Since the leader election today does not take this value into > consideration, we are not guaranteed that newly selected leaders would > always be able to recognize and support the initialized metadata.version > right? > > 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond > the scope of this KIP, we can defer it to later discussions. > > On Mon, Nov 15, 2021 at 8:13 AM David Arthur > wrote: > > > Guozhang, thanks for the review! > > > > 1, As we've defined it so far, the initial metadata.version is set by an > > operator via the "kafka-storage.sh" tool. It would be possible for > > different values to be selected, but only the quorum leader would commit > a > > FeatureLevelRecord with the version they read locally. See the above > reply > > to Jun's question for a little more detail. > > > > We need to enable the KRaft RPCs regardless of metadata.version (vote, > > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be > > elected, and followers can learn about the selected metadata.version. I > > believe the quorum startup procedure would go something like: > > > > * Controller nodes start, read their logs, begin leader election > > * While the elected node is becoming leader (in > > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if > > necessary > > * Followers replicate the FeatureLevelRecord > > > > This (probably) means the quorum peers must continue to rely on > ApiVersions > > to negotiate compatible RPC versions and these versions cannot depend on > > metadata.version. > > > > Does this make sense? > > > > > > 2, ApiVersionResponse includes the broker's supported feature flags as > well > > as the cluster-wide finalized feature flags. We probably need to add > > something like the feature flag "epoch" to this response payload in order > > to see which broker is most up to date. 
> > > > If the new feature flag version included RPC changes, we are helped by > the > > fact that a client won't attempt to use the new RPC until it has > discovered > > a broker that supports it via ApiVersions. The problem is more difficult > > for cases like you described where the feature flag upgrade changes the > > behavior of the broker, but not its RPCs. This is actually the same > problem > > as upgrading the IBP. During a rolling restart, clients may hit different > > brokers with different capabilities and not know it. > > > > This probably warrants further investigation, but hopefully you agree it > is > > beyond the scope of this KIP :) > > > > -David > > > > > > On Mon, Nov 15, 2021 at 10:26 AM David Arthur > > > wrote: > > > > > Jun, thanks for the comments! > > > > > > 16, When a new cluster is deployed, we don't select the highest > available > > > metadata.version, but rather the quorum leader picks a bootstrap > version > > > defined in meta.properties. As mentioned earlier, we should add > > validation > > > here to ensure a majority of the followers could support this version > > > bef
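The majority rule David describes above can be sketched as follows (a hypothetical helper, not Kafka code): an upgrade is validated against every registered controller, so offline or unknown nodes count against the majority rather than being ignored.

```python
def can_upgrade(all_controllers, max_supported, target):
    # all_controllers: ids of every quorum member, alive or not.
    # max_supported: node id -> highest metadata.version the node reported
    # (e.g. via ApiVersions); absent nodes implicitly don't support target.
    supporters = sum(
        1 for node in all_controllers if max_supported.get(node, 0) >= target
    )
    return supporters > len(all_controllers) // 2

quorum = ["C1", "C2", "C3"]
print(can_upgrade(quorum, {"C1": 1, "C2": 4, "C3": 4}, 4))  # True: 2 of 3 support v4
# With C2 down, only C3 is known to support version 4, so the upgrade is rejected:
print(can_upgrade(quorum, {"C1": 1, "C3": 4}, 4))  # False
```

This matches the C1/C2/C3 example: with all nodes up, two of three support version 4; with any one supporter down, both remaining alive nodes must support the target.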
Re: [DISCUSS] KIP-778 KRaft Upgrades
Thanks David,

1. Got it. One thing I'm still not very clear on is why it's sufficient to select a metadata.version which is supported by a majority of the quorum, but not the whole quorum (i.e. choosing the lowest version among all the quorum members)? Since the leader election today does not take this value into consideration, we are not guaranteed that newly selected leaders would always be able to recognize and support the initialized metadata.version, right?

2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond the scope of this KIP, we can defer it to later discussions.

On Mon, Nov 15, 2021 at 8:13 AM David Arthur wrote:

> Guozhang, thanks for the review!
>
> 1, As we've defined it so far, the initial metadata.version is set by an operator via the "kafka-storage.sh" tool. It would be possible for different values to be selected, but only the quorum leader would commit a FeatureLevelRecord with the version they read locally. See the above reply to Jun's question for a little more detail.
>
> We need to enable the KRaft RPCs regardless of metadata.version (vote, heartbeat, fetch, etc) so that the quorum can be formed, a leader can be elected, and followers can learn about the selected metadata.version. I believe the quorum startup procedure would go something like:
>
> * Controller nodes start, read their logs, begin leader election
> * While the elected node is becoming leader (in QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if necessary
> * Followers replicate the FeatureLevelRecord
>
> This (probably) means the quorum peers must continue to rely on ApiVersions to negotiate compatible RPC versions and these versions cannot depend on metadata.version.
>
> Does this make sense?
>
> 2, ApiVersionResponse includes the broker's supported feature flags as well as the cluster-wide finalized feature flags.
We probably need to add > something like the feature flag "epoch" to this response payload in order > to see which broker is most up to date. > > If the new feature flag version included RPC changes, we are helped by the > fact that a client won't attempt to use the new RPC until it has discovered > a broker that supports it via ApiVersions. The problem is more difficult > for cases like you described where the feature flag upgrade changes the > behavior of the broker, but not its RPCs. This is actually the same problem > as upgrading the IBP. During a rolling restart, clients may hit different > brokers with different capabilities and not know it. > > This probably warrants further investigation, but hopefully you agree it is > beyond the scope of this KIP :) > > -David > > > On Mon, Nov 15, 2021 at 10:26 AM David Arthur > wrote: > > > Jun, thanks for the comments! > > > > 16, When a new cluster is deployed, we don't select the highest available > > metadata.version, but rather the quorum leader picks a bootstrap version > > defined in meta.properties. As mentioned earlier, we should add > validation > > here to ensure a majority of the followers could support this version > > before initializing it. This would avoid a situation where a failover > > results in a new leader who can't support the selected metadata.version. > > > > Thinking a bit more on this, we do need to define a state where we're > > running newer software, but we don't have the feature flag set. This > could > > happen if we were running an older IBP that did not support KIP-778. > > Following on this, it doesn't seem too difficult to consider a case where > > the IBP has been upgraded, but we still have not finalized a > > metadata.version. 
Here are some possible version combinations (assuming KIP-778 is added to Kafka 3.2):

> > Case  Software  IBP  metadata.version  effective version
> > A     3.1       3.1  -                 0  (software too old for feature flag)
> > B     3.2       3.1  -                 0  (feature flag supported, but IBP too old)
> > C     3.2       3.2  -                 0  (feature flag supported, but not initialized)
> > D     3.2       3.2  1                 1  (feature flag initialized to 1, via operator or bootstrap process)
> > ...
> > E     3.8       3.1  -                 0  (...IBP too old)
> > F     3.8       3.2  -                 0  (...not initialized)
> > G     3.8       3.2  4                 4
> >
> > Here I'm defining version 0 as "no metadata.version set". So back to your comment, I think the KIP omits discussion of case C from the above table, which I can amend. Does that cover your concerns, or am I missing something else?
> >
> > > it's inconvenient for a user to manually upgrade every feature version
> >
> > For this, we would probably want to extend the capabilities of KIP-584. I don't think anything we've discussed for KIP-778 will preclude us from adding some kind
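The effective-version rules behind the case table above can be modeled like this (my own toy sketch, not Kafka source; the (3, 2) cutoff is the assumed KIP-778 release and None stands in for the "-" of an uninitialized metadata.version):

```python
def effective_metadata_version(software, ibp, metadata_version):
    # software and ibp are (major, minor) tuples; metadata_version is None
    # when no FeatureLevelRecord has been written ("-" in the table).
    kip_778 = (3, 2)  # assumed first release carrying the metadata.version flag
    if software < kip_778:
        return 0  # case A: software too old for the feature flag
    if ibp < kip_778:
        return 0  # cases B, E: feature flag supported, but IBP too old
    if metadata_version is None:
        return 0  # cases C, F: feature flag supported, but not initialized
    return metadata_version  # cases D, G

print(effective_metadata_version((3, 1), (3, 1), None))  # 0 (case A)
print(effective_metadata_version((3, 2), (3, 2), 1))     # 1 (case D)
print(effective_metadata_version((3, 8), (3, 2), 4))     # 4 (case G)
```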
Re: [DISCUSS] KIP-778 KRaft Upgrades
Guozhang, thanks for the review! 1, As we've defined it so far, the initial metadata.version is set by an operator via the "kafka-storage.sh" tool. It would be possible for different values to be selected, but only the quorum leader would commit a FeatureLevelRecord with the version they read locally. See the above reply to Jun's question for a little more detail. We need to enable the KRaft RPCs regardless of metadata.version (vote, heartbeat, fetch, etc) so that the quorum can be formed, a leader can be elected, and followers can learn about the selected metadata.version. I believe the quorum startup procedure would go something like: * Controller nodes start, read their logs, begin leader election * While the elected node is becoming leader (in QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if necessary * Followers replicate the FeatureLevelRecord This (probably) means the quorum peers must continue to rely on ApiVersions to negotiate compatible RPC versions and these versions cannot depend on metadata.version. Does this make sense? 2, ApiVersionResponse includes the broker's supported feature flags as well as the cluster-wide finalized feature flags. We probably need to add something like the feature flag "epoch" to this response payload in order to see which broker is most up to date. If the new feature flag version included RPC changes, we are helped by the fact that a client won't attempt to use the new RPC until it has discovered a broker that supports it via ApiVersions. The problem is more difficult for cases like you described where the feature flag upgrade changes the behavior of the broker, but not its RPCs. This is actually the same problem as upgrading the IBP. During a rolling restart, clients may hit different brokers with different capabilities and not know it. 
This probably warrants further investigation, but hopefully you agree it is beyond the scope of this KIP :)

-David

On Mon, Nov 15, 2021 at 10:26 AM David Arthur wrote:

> Jun, thanks for the comments!
>
> 16, When a new cluster is deployed, we don't select the highest available metadata.version, but rather the quorum leader picks a bootstrap version defined in meta.properties. As mentioned earlier, we should add validation here to ensure a majority of the followers could support this version before initializing it. This would avoid a situation where a failover results in a new leader who can't support the selected metadata.version.
>
> Thinking a bit more on this, we do need to define a state where we're running newer software, but we don't have the feature flag set. This could happen if we were running an older IBP that did not support KIP-778. Following on this, it doesn't seem too difficult to consider a case where the IBP has been upgraded, but we still have not finalized a metadata.version. Here are some possible version combinations (assuming KIP-778 is added to Kafka 3.2):
>
> Case  Software  IBP  metadata.version  effective version
> A     3.1       3.1  -                 0  (software too old for feature flag)
> B     3.2       3.1  -                 0  (feature flag supported, but IBP too old)
> C     3.2       3.2  -                 0  (feature flag supported, but not initialized)
> D     3.2       3.2  1                 1  (feature flag initialized to 1, via operator or bootstrap process)
> ...
> E     3.8       3.1  -                 0  (...IBP too old)
> F     3.8       3.2  -                 0  (...not initialized)
> G     3.8       3.2  4                 4
>
> Here I'm defining version 0 as "no metadata.version set". So back to your comment, I think the KIP omits discussion of case C from the above table, which I can amend. Does that cover your concerns, or am I missing something else?
>
> > it's inconvenient for a user to manually upgrade every feature version
>
> For this, we would probably want to extend the capabilities of KIP-584.
I > don't think anything we've discussed for KIP-778 will preclude us from > adding some kind of auto-upgrade in the future. > > 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit > weird. > > > > On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang wrote: > >> Hello David, >> >> Thanks for the very nice writeup! It helped me a lot to refresh my memory >> on KIP-630/590/584 :) >> >> I just had two clarification questions after reading through the KIP: >> >> 1. For the initialization procedure, do we guarantee that all the quorum >> nodes (inactive candidates and leaders, a.k.a. controllers) would always >> initialize with the same metadata.version? If yes, how is that guaranteed? >> More specifically, when a quorum candidate is starting up, would it avoid >> handling any controller requests (including APIVersionsRequest) from its >> peers in the quorum until it completes reading the local log? An
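David's quorum startup sequence above could be sketched like this (a toy model with invented names; Kafka's actual quorum controller is far more involved): the newly elected leader appends a FeatureLevelRecord only when the replicated log contains none, so a later re-election does not re-initialize the version, and followers learn it by replicating the record.

```python
class ToyQuorumLeader:
    def __init__(self, bootstrap_version, log):
        self.bootstrap_version = bootstrap_version  # read from meta.properties
        self.log = log  # committed metadata records, oldest first

    def handle_leader_change(self):
        # Initialize metadata.version only if no FeatureLevelRecord exists yet.
        if not any(kind == "FeatureLevelRecord" for kind, _ in self.log):
            self.log.append(("FeatureLevelRecord", self.bootstrap_version))

    def metadata_version(self):
        levels = [v for kind, v in self.log if kind == "FeatureLevelRecord"]
        return levels[-1] if levels else None

# Fresh cluster: the leader bootstraps metadata.version from its local config.
leader = ToyQuorumLeader(bootstrap_version=1, log=[])
leader.handle_leader_change()
print(leader.metadata_version())  # 1

# Existing cluster: a record is already committed, so election changes nothing.
leader2 = ToyQuorumLeader(bootstrap_version=1, log=[("FeatureLevelRecord", 4)])
leader2.handle_leader_change()
print(leader2.metadata_version())  # 4
```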
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jun, thanks for the comments! 16, When a new cluster is deployed, we don't select the highest available metadata.version, but rather the quorum leader picks a bootstrap version defined in meta.properties. As mentioned earlier, we should add validation here to ensure a majority of the followers could support this version before initializing it. This would avoid a situation where a failover results in a new leader who can't support the selected metadata.version. Thinking a bit more on this, we do need to define a state where we're running newer software, but we don't have the feature flag set. This could happen if we were running an older IBP that did not support KIP-778. Following on this, it doesn't seem too difficult to consider a case where the IBP has been upgraded, but we still have not finalized a metadata.version. Here are some possible version combinations (assuming KIP-778 is added to Kafka 3.2): Case SoftwareIBPmetadata.versioneffective version -- A 3.1 3.1- 0 software too old for feature flag B 3.2 3.1- 0 feature flag supported, but IBP too old C 3.2 3.2- 0 feature flag supported, but not initialized D 3.2 3.21 1 feature flag initialized to 1 (via operator or bootstrap process) ... E 3.8 3.1- 0 ...IBP too old F 3.8 3.2- 0 ...not initialized G 3.8 3.24 4 Here I'm defining version 0 as "no metadata.version set". So back to your comment, I think the KIP omits discussion of case C from the above table which I can amend. Does that cover your concerns, or am I missing something else? > it's inconvenient for a user to manually upgrade every feature version For this, we would probably want to extend the capabilities of KIP-584. I don't think anything we've discussed for KIP-778 will preclude us from adding some kind of auto-upgrade in the future. 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit weird. On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang wrote: > Hello David, > > Thanks for the very nice writeup! 
It helped me a lot to refresh my memory > on KIP-630/590/584 :) > > I just had two clarification questions after reading through the KIP: > > 1. For the initialization procedure, do we guarantee that all the quorum > nodes (inactive candidates and leaders, a.k.a. controllers) would always > initialize with the same metadata.version? If yes, how is that guaranteed? > More specifically, when a quorum candidate is starting up, would it avoid > handling any controller requests (including APIVersionsRequest) from its > peers in the quorum until it completes reading the local log? And even if > yes, what would happen if there's no FeatureLevelRecord found, and > different nodes read different values from their local meta.properties file > or initializing from their binary's hard-code values? > > 2. This is not only for the metadata.version itself, but for general > feature.versions: when a version is upgraded / downgraded, since brokers > would read the FeatureLevelRecord at their own pace, there will always be a > race window when some brokers has processed the record and completed the > upgrade while others have not, hence may behave differently --- I'm > thinking for the future like the specific replica selector to allow > fetching from follower, and even more advanced selectors --- i.e. should we > consider letting clients to only talk to brokers with the highest metadata > log offsets for example? > > > Guozhang > > > > > On Fri, Nov 5, 2021 at 3:18 PM Jun Rao wrote: > > > Hi, David, > > > > Thanks for the reply. > > > > 16. My first concern is that the KIP picks up meta.version inconsistently > > during the deployment. If a new cluster is started, we pick up the > highest > > version. If we upgrade, we leave the feature version unchanged. > > Intuitively, it seems that independent of how a cluster is deployed, we > > should always pick the same feature version. I think we need to think > > this through in this KIP. 
My second concern is that as a particular > version > > matures, it's inconvenient for a user to manually upgrade every feature > > version. As long as we have a path to achieve that in the future, we > don't > > need to address that in this KIP. > > > > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird > > since the logic is always there. Would it be better to use disable? > > > > Jun > > > > On Fri, Nov 5, 2021 at 8:11 AM David Arthur > > wrote: > > > > > Colin and Jun, thanks for the additional comments! > > > > > > Colin: > > > > > > > We've been talking about having an automated RPC compatibility > checker > > > > > > Do we have a way to mark fields in schemas as deprecated? It can stay > in > > > the RPC, it just com
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hello David,

Thanks for the very nice writeup! It helped me a lot to refresh my memory on KIP-630/590/584 :)

I just had two clarification questions after reading through the KIP:

1. For the initialization procedure, do we guarantee that all the quorum nodes (inactive candidates and leaders, a.k.a. controllers) would always initialize with the same metadata.version? If yes, how is that guaranteed? More specifically, when a quorum candidate is starting up, would it avoid handling any controller requests (including APIVersionsRequest) from its peers in the quorum until it completes reading the local log? And even if yes, what would happen if there's no FeatureLevelRecord found, and different nodes read different values from their local meta.properties files or initialize from their binaries' hard-coded values?

2. This is not only for the metadata.version itself, but for general feature.versions: when a version is upgraded / downgraded, since brokers would read the FeatureLevelRecord at their own pace, there will always be a race window where some brokers have processed the record and completed the upgrade while others have not, and hence may behave differently --- I'm thinking of future features like the specific replica selector that allows fetching from followers, and even more advanced selectors --- i.e. should we consider letting clients only talk to brokers with the highest metadata log offsets, for example?

Guozhang

On Fri, Nov 5, 2021 at 3:18 PM Jun Rao wrote:

> Hi, David,
>
> Thanks for the reply.
>
> 16. My first concern is that the KIP picks up meta.version inconsistently during the deployment. If a new cluster is started, we pick up the highest version. If we upgrade, we leave the feature version unchanged. Intuitively, it seems that independent of how a cluster is deployed, we should always pick the same feature version. I think we need to think this through in this KIP.
My second concern is that as a particular version > matures, it's inconvenient for a user to manually upgrade every feature > version. As long as we have a path to achieve that in the future, we don't > need to address that in this KIP. > > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird > since the logic is always there. Would it be better to use disable? > > Jun > > On Fri, Nov 5, 2021 at 8:11 AM David Arthur > wrote: > > > Colin and Jun, thanks for the additional comments! > > > > Colin: > > > > > We've been talking about having an automated RPC compatibility checker > > > > Do we have a way to mark fields in schemas as deprecated? It can stay in > > the RPC, it just complicates the logic a bit. > > > > > It would be nice if the active controller could validate that a > majority > > of the quorum could use the proposed metadata.version. The active > > controller should have this information, right? If we don't have recent > > information from a quorum of voters, we wouldn't be active. > > > > I believe we should have this information from the ApiVersionsResponse. > It > > would be good to do this validation to avoid a situation where a > > quorum leader can't be elected due to unprocessable records. > > > > > Do we need delete as a command separate from downgrade? > > > > I think from an operator's perspective, it is nice to distinguish between > > changing a feature flag and unsetting it. It might be surprising to an > > operator to see the flag's version set to nothing when they requested the > > downgrade to version 0 (or less). > > > > > it seems like we should spell out that metadata.version begins at 1 in > > KRaft clusters > > > > I added this text: > > > > Introduce an IBP version to indicate the lowest software version that > > > supports *metadata.version*. Below this IBP, the *metadata.version* is > > > undefined and will not be examined. 
At or above this IBP, the > > > *metadata.version* must be *0* for ZooKeeper clusters and will be > > > initialized as *1* for KRaft clusters. > > > > > > > We probably also want an RPC implemented by both brokers and > controllers > > that will reveal the min and max supported versions for each feature > level > > supported by the server > > > > This is available in ApiVersionsResponse (we include the server's > supported > > features as well as the cluster's finalized features) > > > > > > > > Jun: > > > > 12. I've updated the KIP with AdminClient changes > > > > 14. You're right, it looks like I missed a few sections regarding > snapshot > > generation. I've corrected it > > > > 16. This feels more like an enhancement to KIP-584. I agree it could be > > useful, but perhaps we could address it separately from KRaft upgrades? > > > > 20. Indeed snapshots are not strictly necessary during an upgrade, I've > > reworded this > > > > > > Thanks! > > David > > > > > > On Thu, Nov 4, 2021 at 6:51 PM Jun Rao wrote: > > > > > Hi, David, Jose and Colin, > > > > > > Thanks for the reply. A few more comments. > > > > > > 12. It seems that we haven't
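Guozhang's question 2 above (the race window while brokers apply a FeatureLevelRecord at different paces) suggests one possible mitigation. A purely hypothetical sketch of it, not an existing Kafka mechanism: restrict a client to the brokers whose applied metadata log offset is the highest observed, so it never talks to a broker that has not yet processed the latest record.

```python
def up_to_date_brokers(applied_offsets):
    # applied_offsets: broker id -> last metadata log offset the broker has
    # applied. Brokers at the high-water mark have seen every committed
    # FeatureLevelRecord, so a client restricted to them avoids the race.
    high_water = max(applied_offsets.values())
    return sorted(b for b, off in applied_offsets.items() if off == high_water)

print(up_to_date_brokers({1: 100, 2: 95, 3: 100}))  # [1, 3]
```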
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Thanks for the reply. 16. My first concern is that the KIP picks up meta.version inconsistently during the deployment. If a new cluster is started, we pick up the highest version. If we upgrade, we leave the feature version unchanged. Intuitively, it seems that independent of how a cluster is deployed, we should always pick the same feature version. I think we need to think this through in this KIP. My second concern is that as a particular version matures, it's inconvenient for a user to manually upgrade every feature version. As long as we have a path to achieve that in the future, we don't need to address that in this KIP. 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird since the logic is always there. Would it be better to use disable? Jun On Fri, Nov 5, 2021 at 8:11 AM David Arthur wrote: > Colin and Jun, thanks for the additional comments! > > Colin: > > > We've been talking about having an automated RPC compatibility checker > > Do we have a way to mark fields in schemas as deprecated? It can stay in > the RPC, it just complicates the logic a bit. > > > It would be nice if the active controller could validate that a majority > of the quorum could use the proposed metadata.version. The active > controller should have this information, right? If we don't have recent > information from a quorum of voters, we wouldn't be active. > > I believe we should have this information from the ApiVersionsResponse. It > would be good to do this validation to avoid a situation where a > quorum leader can't be elected due to unprocessable records. > > > Do we need delete as a command separate from downgrade? > > I think from an operator's perspective, it is nice to distinguish between > changing a feature flag and unsetting it. It might be surprising to an > operator to see the flag's version set to nothing when they requested the > downgrade to version 0 (or less). 
> > > it seems like we should spell out that metadata.version begins at 1 in > KRaft clusters > > I added this text: > > Introduce an IBP version to indicate the lowest software version that > > supports *metadata.version*. Below this IBP, the *metadata.version* is > > undefined and will not be examined. At or above this IBP, the > > *metadata.version* must be *0* for ZooKeeper clusters and will be > > initialized as *1* for KRaft clusters. > > > > We probably also want an RPC implemented by both brokers and controllers > that will reveal the min and max supported versions for each feature level > supported by the server > > This is available in ApiVersionsResponse (we include the server's supported > features as well as the cluster's finalized features) > > > > Jun: > > 12. I've updated the KIP with AdminClient changes > > 14. You're right, it looks like I missed a few sections regarding snapshot > generation. I've corrected it > > 16. This feels more like an enhancement to KIP-584. I agree it could be > useful, but perhaps we could address it separately from KRaft upgrades? > > 20. Indeed snapshots are not strictly necessary during an upgrade, I've > reworded this > > > Thanks! > David > > > On Thu, Nov 4, 2021 at 6:51 PM Jun Rao wrote: > > > Hi, David, Jose and Colin, > > > > Thanks for the reply. A few more comments. > > > > 12. It seems that we haven't updated the AdminClient accordingly? > > > > 14. "Metadata snapshot is generated and sent to the other inactive > > controllers and to brokers". I thought we wanted each broker to generate > > its own snapshot independently? If only the controller generates the > > snapshot, how do we force other brokers to pick it up? > > > > 16. If a feature version is new, one may not want to enable it > immediately > > after the cluster is upgraded. However, if a feature version has been > > stable, requiring every user to run a command to upgrade to that version > > seems inconvenient. 
One way to improve this is for each feature to define > > one version as the default. Then, when we upgrade a cluster, we will > > automatically upgrade the feature to the default version. An admin could > > use the tool to upgrade to a version higher than the default. > > > > 20. "The quorum controller can assist with this process by generating a > > metadata snapshot after a metadata.version increase has been committed to > > the metadata log. This snapshot will be a convenient way to let broker > and > > controller components rebuild their entire in-memory state following an > > upgrade." The new version of the software could read both the new and the > > old version. Is generating a new snapshot during upgrade needed? > > > > Jun > > > > > > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe wrote: > > > > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: > > > > Hi, David, > > > > > > > > One more comment. > > > > > > > > 16. The main reason why KIP-584 requires finalizing a feature > manually > > is > > > > that in the ZK world, the controller doesn't know all brokers in a > > > clu
Re: [DISCUSS] KIP-778 KRaft Upgrades
Colin and Jun, thanks for the additional comments! Colin: > We've been talking about having an automated RPC compatibility checker Do we have a way to mark fields in schemas as deprecated? It can stay in the RPC, it just complicates the logic a bit. > It would be nice if the active controller could validate that a majority of the quorum could use the proposed metadata.version. The active controller should have this information, right? If we don't have recent information from a quorum of voters, we wouldn't be active. I believe we should have this information from the ApiVersionsResponse. It would be good to do this validation to avoid a situation where a quorum leader can't be elected due to unprocessable records. > Do we need delete as a command separate from downgrade? I think from an operator's perspective, it is nice to distinguish between changing a feature flag and unsetting it. It might be surprising to an operator to see the flag's version set to nothing when they requested the downgrade to version 0 (or less). > it seems like we should spell out that metadata.version begins at 1 in KRaft clusters I added this text: Introduce an IBP version to indicate the lowest software version that > supports *metadata.version*. Below this IBP, the *metadata.version* is > undefined and will not be examined. At or above this IBP, the > *metadata.version* must be *0* for ZooKeeper clusters and will be > initialized as *1* for KRaft clusters. > We probably also want an RPC implemented by both brokers and controllers that will reveal the min and max supported versions for each feature level supported by the server This is available in ApiVersionsResponse (we include the server's supported features as well as the cluster's finalized features) Jun: 12. I've updated the KIP with AdminClient changes 14. You're right, it looks like I missed a few sections regarding snapshot generation. I've corrected it 16. This feels more like an enhancement to KIP-584. 
I agree it could be useful, but perhaps we could address it separately from KRaft upgrades? 20. Indeed snapshots are not strictly necessary during an upgrade, I've reworded this Thanks! David On Thu, Nov 4, 2021 at 6:51 PM Jun Rao wrote: > Hi, David, Jose and Colin, > > Thanks for the reply. A few more comments. > > 12. It seems that we haven't updated the AdminClient accordingly? > > 14. "Metadata snapshot is generated and sent to the other inactive > controllers and to brokers". I thought we wanted each broker to generate > its own snapshot independently? If only the controller generates the > snapshot, how do we force other brokers to pick it up? > > 16. If a feature version is new, one may not want to enable it immediately > after the cluster is upgraded. However, if a feature version has been > stable, requiring every user to run a command to upgrade to that version > seems inconvenient. One way to improve this is for each feature to define > one version as the default. Then, when we upgrade a cluster, we will > automatically upgrade the feature to the default version. An admin could > use the tool to upgrade to a version higher than the default. > > 20. "The quorum controller can assist with this process by generating a > metadata snapshot after a metadata.version increase has been committed to > the metadata log. This snapshot will be a convenient way to let broker and > controller components rebuild their entire in-memory state following an > upgrade." The new version of the software could read both the new and the > old version. Is generating a new snapshot during upgrade needed? > > Jun > > > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe wrote: > > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: > > > Hi, David, > > > > > > One more comment. > > > > > > 16. The main reason why KIP-584 requires finalizing a feature manually > is > > > that in the ZK world, the controller doesn't know all brokers in a > > cluster. 
> > > A broker temporarily down is not registered in ZK. In the KRaft world, > > the > > > controller keeps track of all brokers, including those that are > > temporarily > > > down. This makes it possible for the controller to automatically > > finalize a > > > feature---it's safe to do so when all brokers support that feature. > This > > > will make the upgrade process much simpler since no manual command is > > > required to turn on a new feature. Have we considered this? > > > > > > Thanks, > > > > > > Jun > > > > Hi Jun, > > > > I guess David commented on this point already, but I'll comment as well. > I > > always had the perception that users viewed rolls as potentially risky > and > > were looking for ways to reduce the risk. Not enabling features right > away > > after installing new software seems like one way to do that. If we had a > > feature to automatically upgrade during a roll, I'm not sure that I would > > recommend that people use it, because if something fails, it makes it > > harder to tell if the
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Jose and Colin, Thanks for the reply. A few more comments. 12. It seems that we haven't updated the AdminClient accordingly? 14. "Metadata snapshot is generated and sent to the other inactive controllers and to brokers". I thought we wanted each broker to generate its own snapshot independently? If only the controller generates the snapshot, how do we force other brokers to pick it up? 16. If a feature version is new, one may not want to enable it immediately after the cluster is upgraded. However, if a feature version has been stable, requiring every user to run a command to upgrade to that version seems inconvenient. One way to improve this is for each feature to define one version as the default. Then, when we upgrade a cluster, we will automatically upgrade the feature to the default version. An admin could use the tool to upgrade to a version higher than the default. 20. "The quorum controller can assist with this process by generating a metadata snapshot after a metadata.version increase has been committed to the metadata log. This snapshot will be a convenient way to let broker and controller components rebuild their entire in-memory state following an upgrade." The new version of the software could read both the new and the old version. Is generating a new snapshot during upgrade needed? Jun On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe wrote: > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: > > Hi, David, > > > > One more comment. > > > > 16. The main reason why KIP-584 requires finalizing a feature manually is > > that in the ZK world, the controller doesn't know all brokers in a > cluster. > > A broker temporarily down is not registered in ZK. In the KRaft world, > the > > controller keeps track of all brokers, including those that are > temporarily > > down. This makes it possible for the controller to automatically > finalize a > > feature---it's safe to do so when all brokers support that feature. 
This > > will make the upgrade process much simpler since no manual command is > > required to turn on a new feature. Have we considered this? > > > > Thanks, > > > > Jun > > Hi Jun, > > I guess David commented on this point already, but I'll comment as well. I > always had the perception that users viewed rolls as potentially risky and > were looking for ways to reduce the risk. Not enabling features right away > after installing new software seems like one way to do that. If we had a > feature to automatically upgrade during a roll, I'm not sure that I would > recommend that people use it, because if something fails, it makes it > harder to tell if the new feature is at fault, or something else in the new > software. > > We already tell users to do a "double roll" when going to a new IBP. (Just > to give background to people who haven't heard that phrase, the first roll > installs the new software, and the second roll updates the IBP). So this > KIP-778 mechanism is basically very similar to that, except the second > thing isn't a roll, but just an upgrade command. So I think this is > consistent with what we currently do. > > Also, just like David said, we can always add auto-upgrade later if there > is demand... > > best, > Colin > > > > > > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao wrote: > > > >> Hi, David, > >> > >> Thanks for the KIP. A few comments below. > >> > >> 10. It would be useful to describe how the controller node determines > the > >> RPC version used to communicate to other controller nodes. There seems > to > >> be a bootstrap problem. A controller node can't read the log and > >> therefore the feature level until a quorum leader is elected. But leader > >> election requires an RPC. > >> > >> 11. For downgrades, it would be useful to describe how to determine the > >> downgrade process (generating new snapshot, propagating the snapshot, > etc) > >> has completed. We could block the UpdateFeature request until the > process > >> is completed. 
However, since the process could take time, the request > could > >> time out. Another way is through DescribeFeature and the server only > >> reports downgraded versions after the process is completed. > >> > >> 12. Since we are changing UpdateFeaturesRequest, do we need to change > the > >> AdminClient api for updateFeatures too? > >> > >> 13. For the paragraph starting with "In the absence of an operator > >> defined value for metadata.version", in KIP-584, we described how to > >> finalize features with New cluster bootstrap. In that case, it's > >> inconvenient for the users to have to run an admin tool to finalize the > >> version for each feature. Instead, the system detects that the /features > >> path is missing in ZK and thus automatically finalizes every feature > with > >> the latest supported version. Could we do something similar in the KRaft > >> mode? > >> > >> 14. After the quorum leader generates a new snapshot, how do we force > >> other nodes to pick up the new snapshot? > >> > >> 15. I agree with Jose th
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote: > Hi, David, > > One more comment. > > 16. The main reason why KIP-584 requires finalizing a feature manually is > that in the ZK world, the controller doesn't know all brokers in a cluster. > A broker temporarily down is not registered in ZK. In the KRaft world, the > controller keeps track of all brokers, including those that are temporarily > down. This makes it possible for the controller to automatically finalize a > feature---it's safe to do so when all brokers support that feature. This > will make the upgrade process much simpler since no manual command is > required to turn on a new feature. Have we considered this? > > Thanks, > > Jun Hi Jun, I guess David commented on this point already, but I'll comment as well. I always had the perception that users viewed rolls as potentially risky and were looking for ways to reduce the risk. Not enabling features right away after installing new software seems like one way to do that. If we had a feature to automatically upgrade during a roll, I'm not sure that I would recommend that people use it, because if something fails, it makes it harder to tell if the new feature is at fault, or something else in the new software. We already tell users to do a "double roll" when going to a new IBP. (Just to give background to people who haven't heard that phrase, the first roll installs the new software, and the second roll updates the IBP). So this KIP-778 mechanism is basically very similar to that, except the second thing isn't a roll, but just an upgrade command. So I think this is consistent with what we currently do. Also, just like David said, we can always add auto-upgrade later if there is demand... best, Colin > > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao wrote: > >> Hi, David, >> >> Thanks for the KIP. A few comments below. >> >> 10. It would be useful to describe how the controller node determines the >> RPC version used to communicate to other controller nodes. 
There seems to >> be a bootstrap problem. A controller node can't read the log and >> therefore the feature level until a quorum leader is elected. But leader >> election requires an RPC. >> >> 11. For downgrades, it would be useful to describe how to determine the >> downgrade process (generating new snapshot, propagating the snapshot, etc) >> has completed. We could block the UpdateFeature request until the process >> is completed. However, since the process could take time, the request could >> time out. Another way is through DescribeFeature and the server only >> reports downgraded versions after the process is completed. >> >> 12. Since we are changing UpdateFeaturesRequest, do we need to change the >> AdminClient api for updateFeatures too? >> >> 13. For the paragraph starting with "In the absence of an operator >> defined value for metadata.version", in KIP-584, we described how to >> finalize features with New cluster bootstrap. In that case, it's >> inconvenient for the users to have to run an admin tool to finalize the >> version for each feature. Instead, the system detects that the /features >> path is missing in ZK and thus automatically finalizes every feature with >> the latest supported version. Could we do something similar in the KRaft >> mode? >> >> 14. After the quorum leader generates a new snapshot, how do we force >> other nodes to pick up the new snapshot? >> >> 15. I agree with Jose that it will be useful to describe when generating a >> new snapshot is needed. To me, it seems the new snapshot is only needed >> when incompatible changes are made. >> >> 7. Jose, what control records were you referring? >> >> Thanks, >> >> Jun >> >> >> On Tue, Oct 5, 2021 at 8:53 AM David Arthur >> wrote: >> >>> Jose, thanks for the thorough review and comments! >>> >>> I am out of the office until next week, so I probably won't be able to >>> update the KIP until then. Here are some replies to your questions: >>> >>> 1. 
Generate snapshot on upgrade >>> > > Metadata snapshot is generated and sent to the other nodes >>> > Why does the Active Controller need to generate a new snapshot and >>> > force a snapshot fetch from the replicas (inactive controller and >>> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good >>> > enough to communicate the upgrade to the replicas? >>> >>> >>> You're right, we don't necessarily need to _transmit_ a snapshot, since >>> each node can generate its own equivalent snapshot >>> >>> 2. Generate snapshot on downgrade >>> > > Metadata snapshot is generated and sent to the other inactive >>> > controllers and to brokers (this snapshot may be lossy!) >>> > Why do we need to send this downgraded snapshot to the brokers? The >>> > replicas have seen the FeatureLevelRecord and noticed the downgrade. >>> > Can we have the replicas each independently generate a downgraded >>> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume >>> > that the active controller will guara
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David, Thanks again for the KIP. David Arthur wrote: > 101. Change AllowDowngrade bool to DowngradeType int8 in > UpdateFeaturesRequest RPC. I'm wondering if we can kind of "cheat" on this > incompatible change since it's not currently in use and totally remove the > old field and leave the versions at 0+. Thoughts? We've been talking about having an automated RPC compatibility checker, and doing something like this might make that more complex. On the other hand, I suppose it could be annotated as a special case. David Arthur wrote: > The active controller should probably validate whatever value is read from > meta.properties against its own range of supported versions (statically > defined in code). If the operator sets a version unsupported by the active > controller, that sounds like a configuration error and we should shutdown. > I'm not sure what other validation we could do here without introducing > ordering dependencies (e.g., must have quorum before initializing the > version) It would be nice if the active controller could validate that a majority of the quorum could use the proposed metadata.version. The active controller should have this information, right? If we don't have recent information from a quorum of voters, we wouldn't be active. David Arthur wrote: > Hey folks, I just updated the KIP with details on proposed changes to the > kafka-features.sh tool. It includes four proposed sub-commands which will > provide the Basic and Advanced functions detailed in KIP-584. Please have a > look, thanks! I like the new sub-commands... seems very clean. Do we need delete as a command separate from downgrade? In KIP-584, we agreed to keep version 0 reserved for "no such feature flag." So downgrading to version 0 should be the same as deletion. On another note, it seems like we should spell out that metadata.version begins at 1 in KRaft clusters, and that it's always 0 in ZK-based clusters. Maybe this is obvious but it would be good to write it down here. 
It seems like we never got closure on the issue of simplifying feature levels. As I said earlier, I think the min/max thing is not needed for 99% of the use-cases we've talked about for feature levels. Certainly, it's not needed for metadata.version. Having it be mandatory for every feature level, whether it needs it or not, is an extra complication that I think we should get rid of, before this interface becomes set in stone. (Use-cases that need this can simply have two feature flags, X_min and X_max, as I said before.) We probably also want an RPC implemented by both brokers and controllers that will reveal the min and max supported versions for each feature level supported by the server. This is useful for diagnostics. And I suppose we should have a command line option to the features command that broadcasts it to everyone and returns the results (or lack of result, if the connection couldn't be made...) best, Colin
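The min/max ranges Colin wants exposed via ApiVersions lead directly to the finalization rule discussed throughout this thread: the cluster can only finalize a level that every node supports. A small sketch of that intersection logic (illustrative Python, not Kafka code; the function name is invented for this example):

```python
# Illustrative sketch (not Kafka code): the feature level a cluster can
# finalize is bounded by the intersection of the [min, max] supported
# ranges that each broker and controller reports via ApiVersionsResponse.

def finalizable_range(reported_ranges):
    """reported_ranges: iterable of (min_supported, max_supported) pairs,
    one per node. Returns the common (lo, hi) range, or None when the
    nodes share no version and no level can be safely finalized."""
    lo = max(r[0] for r in reported_ranges)
    hi = min(r[1] for r in reported_ranges)
    return (lo, hi) if lo <= hi else None
```

For example, nodes reporting (1, 4) and (2, 5) share the range (2, 4), while disjoint ranges yield None, which is exactly the case where an upgrade request must be rejected.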
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hey folks, I just updated the KIP with details on proposed changes to the kafka-features.sh tool. It includes four proposed sub-commands which will provide the Basic and Advanced functions detailed in KIP-584. Please have a look, thanks! https://cwiki.apache.org/confluence/display/KAFKA/KIP-778%3A+KRaft+Upgrades#KIP778:KRaftUpgrades-KIP-584Addendum Aside from this change, if there isn't any more feedback on the KIP I'd like to start a vote soon. Cheers, David On Thu, Oct 21, 2021 at 3:09 AM Kowshik Prakasam wrote: > Hi David, > > Thanks for the explanations. Few comments below. > > 7001. Sounds good. > > 7002. Sounds good. The --force-downgrade-all option can be used for the > basic CLI while the --force-downgrade option can be used for the advanced > CLI. > > 7003. I like your suggestion on separate sub-commands, I agree it's more > convenient to use. > > 7004/7005. Your explanation sounds good to me. Regarding the min finalized > version level, this becomes useful for feature version deprecation as > explained here: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation > . This is not implemented yet, and the work item is tracked in KAFKA-10622. > > > Cheers, > Kowshik > > > > On Fri, Oct 15, 2021 at 11:38 AM David Arthur wrote: > > > > > > > How does the active controller know what is a valid `metadata.version` > > > to persist? Could the active controller learn this from the > > > ApiVersions response from all of the inactive controllers? > > > > > > The active controller should probably validate whatever value is read > from > > meta.properties against its own range of supported versions (statically > > defined in code). If the operator sets a version unsupported by the > active > > controller, that sounds like a configuration error and we should > shutdown. 
> > I'm not sure what other validation we could do here without introducing > > ordering dependencies (e.g., must have quorum before initializing the > > version) > > > > For example, let's say that we have a cluster that only has remote > > > controllers, what are the valid metadata.version in that case? > > > > > > I believe it would be the intersection of supported versions across all > > brokers and controllers. This does raise a concern with upgrading the > > metadata.version in general. Currently, the active controller only > > validates the target version based on the brokers' support versions. We > > will need to include controllers supported versions here as well (using > > ApiVersions, probably). > > > > On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio > > wrote: > > > > > On Fri, Oct 15, 2021 at 7:24 AM David Arthur wrote: > > > > Hmm. So I think you are proposing the following flow: > > > > > 1. Cluster metadata partition replicas establish a quorum using > > > > > ApiVersions and the KRaft protocol. > > > > > 2. Inactive controllers send a registration RPC to the active > > > controller. > > > > > 3. The active controller persists this information to the metadata > > log. > > > > > > > > > > > > What happens if the inactive controllers send a metadata.version > range > > > > > that is not compatible with the metadata.version set for the > cluster? > > > > > > > > > > > > As we discussed offline, we don't need the explicit registration > step. > > > Once > > > > a controller has joined the quorum, it will learn about the finalized > > > > "metadata.version" level once it reads that record. > > > > > > How does the active controller know what is a valid `metadata.version` > > > to persist? Could the active controller learn this from the > > > ApiVersions response from all of the inactive controllers? For > > > example, let's say that we have a cluster that only has remote > > > controllers, what are the valid metadata.version in that case? 
> > > > > > > If it encounters a > > > > version it can't support it should probably shutdown since it might > not > > > be > > > > able to process any more records. > > > > > > I think that makes sense. If a controller cannot replay the metadata > > > log, it might as well not be part of the quorum. If the cluster > > > continues in this state it won't guarantee availability based on the > > > replication factor. > > > > > > Thanks > > > -- > > > -Jose > > > > > > > > > -- > > David Arthur > > > -- David Arthur
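The four kafka-features.sh sub-commands David mentions could be structured roughly as below (an illustrative Python argparse sketch, not the actual tool; "delete" follows this thread's proposal, which Jun elsewhere suggests renaming to "disable", and the flag names are assumptions, the authoritative interface is the KIP-584 addendum linked above):

```python
# Sketch of the four kafka-features.sh sub-commands proposed in the
# KIP-584 addendum. Sub-command and flag names are illustrative only.

import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="kafka-features.sh")
    subs = parser.add_subparsers(dest="command", required=True)

    subs.add_parser("describe")  # show finalized and supported feature levels

    for cmd in ("upgrade", "downgrade"):
        sub = subs.add_parser(cmd)
        sub.add_argument("--feature", required=True)
        sub.add_argument("--version", type=int, required=True)

    delete = subs.add_parser("delete")  # unset a flag entirely
    delete.add_argument("--feature", required=True)

    return parser
```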
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David, Thanks for the explanations. Few comments below. 7001. Sounds good. 7002. Sounds good. The --force-downgrade-all option can be used for the basic CLI while the --force-downgrade option can be used for the advanced CLI. 7003. I like your suggestion on separate sub-commands, I agree it's more convenient to use. 7004/7005. Your explanation sounds good to me. Regarding the min finalized version level, this becomes useful for feature version deprecation as explained here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation . This is not implemented yet, and the work item is tracked in KAFKA-10622. Cheers, Kowshik On Fri, Oct 15, 2021 at 11:38 AM David Arthur wrote: > > > > How does the active controller know what is a valid `metadata.version` > > to persist? Could the active controller learn this from the > > ApiVersions response from all of the inactive controllers? > > > The active controller should probably validate whatever value is read from > meta.properties against its own range of supported versions (statically > defined in code). If the operator sets a version unsupported by the active > controller, that sounds like a configuration error and we should shutdown. > I'm not sure what other validation we could do here without introducing > ordering dependencies (e.g., must have quorum before initializing the > version) > > For example, let's say that we have a cluster that only has remote > > controllers, what are the valid metadata.version in that case? > > > I believe it would be the intersection of supported versions across all > brokers and controllers. This does raise a concern with upgrading the > metadata.version in general. Currently, the active controller only > validates the target version based on the brokers' support versions. We > will need to include controllers supported versions here as well (using > ApiVersions, probably). 
> > On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio > wrote: > > > On Fri, Oct 15, 2021 at 7:24 AM David Arthur wrote: > > > Hmm. So I think you are proposing the following flow: > > > > 1. Cluster metadata partition replicas establish a quorum using > > > > ApiVersions and the KRaft protocol. > > > > 2. Inactive controllers send a registration RPC to the active > > controller. > > > > 3. The active controller persists this information to the metadata > log. > > > > > > > > > What happens if the inactive controllers send a metadata.version range > > > > that is not compatible with the metadata.version set for the cluster? > > > > > > > > > As we discussed offline, we don't need the explicit registration step. > > Once > > > a controller has joined the quorum, it will learn about the finalized > > > "metadata.version" level once it reads that record. > > > > How does the active controller know what is a valid `metadata.version` > > to persist? Could the active controller learn this from the > > ApiVersions response from all of the inactive controllers? For > > example, let's say that we have a cluster that only has remote > > controllers, what are the valid metadata.version in that case? > > > > > If it encounters a > > > version it can't support it should probably shutdown since it might not > > be > > > able to process any more records. > > > > I think that makes sense. If a controller cannot replay the metadata > > log, it might as well not be part of the quorum. If the cluster > > continues in this state it won't guarantee availability based on the > > replication factor. > > > > Thanks > > -- > > -Jose > > > > > -- > David Arthur >
Re: [DISCUSS] KIP-778 KRaft Upgrades
> > How does the active controller know what is a valid `metadata.version` > to persist? Could the active controller learn this from the > ApiVersions response from all of the inactive controllers? The active controller should probably validate whatever value is read from meta.properties against its own range of supported versions (statically defined in code). If the operator sets a version unsupported by the active controller, that sounds like a configuration error and we should shutdown. I'm not sure what other validation we could do here without introducing ordering dependencies (e.g., must have quorum before initializing the version) For example, let's say that we have a cluster that only has remote > controllers, what are the valid metadata.version in that case? I believe it would be the intersection of supported versions across all brokers and controllers. This does raise a concern with upgrading the metadata.version in general. Currently, the active controller only validates the target version based on the brokers' supported versions. We will need to include the controllers' supported versions here as well (using ApiVersions, probably). On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio wrote: > On Fri, Oct 15, 2021 at 7:24 AM David Arthur wrote: > > Hmm. So I think you are proposing the following flow: > > > 1. Cluster metadata partition replicas establish a quorum using > > > ApiVersions and the KRaft protocol. > > > 2. Inactive controllers send a registration RPC to the active > controller. > > > 3. The active controller persists this information to the metadata log. > > > > > > What happens if the inactive controllers send a metadata.version range > > > that is not compatible with the metadata.version set for the cluster? > > > > > > As we discussed offline, we don't need the explicit registration step. > Once > > a controller has joined the quorum, it will learn about the finalized > > "metadata.version" level once it reads that record. 
> > How does the active controller know what is a valid `metadata.version` > to persist? Could the active controller learn this from the > ApiVersions response from all of the inactive controllers? For > example, let's say that we have a cluster that only has remote > controllers, what are the valid metadata.version in that case? > > > If it encounters a > > version it can't support it should probably shutdown since it might not > be > > able to process any more records. > > I think that makes sense. If a controller cannot replay the metadata > log, it might as well not be part of the quorum. If the cluster > continues in this state it won't guarantee availability based on the > replication factor. > > Thanks > -- > -Jose > -- David Arthur
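The validation David describes, checking the target version against the supported versions of brokers and controllers alike, amounts to the following (a hypothetical sketch; function and parameter names are invented, not Kafka classes):

```python
# Hypothetical sketch of the validation discussed above: before finalizing
# a new metadata.version, the active controller checks the target against
# the supported range of every broker AND every quorum controller, both
# learned from ApiVersions responses.

def can_finalize(target, broker_ranges, controller_ranges):
    """Each range is a (min_supported, max_supported) pair. The change is
    only safe when every node in the cluster can process the target level."""
    all_ranges = list(broker_ranges) + list(controller_ranges)
    return all(lo <= target <= hi for lo, hi in all_ranges)
```

Note that checking brokers alone is not enough: a target of 5 passes a broker supporting (1, 5) but must still be rejected if some controller only supports up to 4.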
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Fri, Oct 15, 2021 at 7:24 AM David Arthur wrote: > Hmm. So I think you are proposing the following flow: > > 1. Cluster metadata partition replicas establish a quorum using > > ApiVersions and the KRaft protocol. > > 2. Inactive controllers send a registration RPC to the active controller. > > 3. The active controller persists this information to the metadata log. > > > What happens if the inactive controllers send a metadata.version range > > that is not compatible with the metadata.version set for the cluster? > > > As we discussed offline, we don't need the explicit registration step. Once > a controller has joined the quorum, it will learn about the finalized > "metadata.version" level once it reads that record. How does the active controller know what is a valid `metadata.version` to persist? Could the active controller learn this from the ApiVersions response from all of the inactive controllers? For example, let's say that we have a cluster that only has remote controllers, what are the valid metadata.version in that case? > If it encounters a > version it can't support it should probably shutdown since it might not be > able to process any more records. I think that makes sense. If a controller cannot replay the metadata log, it might as well not be part of the quorum. If the cluster continues in this state it won't guarantee availability based on the replication factor. Thanks -- -Jose
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hey everyone, I've updated the KIP with a few items we've discussed so far: 101. Change AllowDowngrade bool to DowngradeType int8 in UpgradeFeatureRequest RPC. I'm wondering if we can kind of "cheat" on this incompatible change since it's not currently in use and totally remove the old field and leave the versions at 0+. Thoughts? 102. Add section on metadata.version initialization. This includes a new option to kafka-storage.sh to let the operator set the initial version 103. Add section on behavior of controllers and brokers if they encounter an unsupported version Thanks for the great discussion so far! -David On Fri, Oct 15, 2021 at 10:23 AM David Arthur wrote: > Jose, > > Are you saying that for metadata.version X different software versions >> could generate different snapshots? If so, I would consider this an >> implementation bug, no? The format and content of a snapshot is a >> public API that needs to be supported across software versions. > > > I agree this would be a bug. This is probably a non-issue since we will > recommend brokers to be at the same software version before performing an > upgrade or downgrade. That eliminates the possibility that brokers could > generate different snapshots. > > Hmm. So I think you are proposing the following flow: >> 1. Cluster metadata partition replicas establish a quorum using >> ApiVersions and the KRaft protocol. >> 2. Inactive controllers send a registration RPC to the active controller. >> 3. The active controller persists this information to the metadata log. > > > What happens if the inactive controllers send a metadata.version range >> that is not compatible with the metadata.version set for the cluster? > > > As we discussed offline, we don't need the explicit registration step. > Once a controller has joined the quorum, it will learn about the finalized > "metadata.version" level once it reads that record. 
If it encounters a > version it can't support it should probably shutdown since it might not be > able to process any more records. > > However, this does lead to a race condition when a controller is catching > up on the metadata log. Let's say a controller comes up and fetches the > following records from the leader: > > > [M1, M2, M3, FeatureLevelRecord] > > Between the time the controller starts handling requests and it processes > the FeatureLevelRecord, it could report an old value of metadata.version > when handling ApiVersionsRequest. In the brokers, I believe we delay > starting the request handler threads until the broker has been unfenced > (meaning, it is reasonably caught up). We might need something similar for > controllers. > > -David > > On Thu, Oct 14, 2021 at 9:29 PM David Arthur wrote: > >> Kowshik, thanks for the review! >> >> 7001. An enum sounds like a good idea here. Especially since setting >> Upgrade=false and Force=true doesn't really make sense. An enum would avoid >> confusing/invalid combinations of flags >> >> 7002. How about adding --force-downgrade as an alternative to the >> --downgrade argument? So, it would take the same arguments (list of >> feature:version), but use the DOWNGRADE_UNSAFE option in the RPC. >> >> 7003. Yes we will need the advanced CLI since we will need to only modify >> the "metadata.version" FF. I was kind of wondering if we might want >> separate sub-commands for the different operations instead of all under >> "update". E.g., "kafka-features.sh >> upgrade|downgrade|force-downgrade|delete|describe". >> >> 7004/7005. The intention in this KIP is that we bump the >> "metadata.version" liberally and that most upgrades are backwards >> compatible. We're relying on this for feature gating as well as indicating >> compatibility. The enum is indeed an implementation detail that is enforced >> by the controller when handling UpdateFeaturesRequest. 
As for lossy >> downgrades, this really only applies to the metadata log as we will lose >> some fields and records when downgrading to an older version. This is >> useful as an escape hatch for cases when a software upgrade has occurred, >> the feature flag was increased, and then bugs are discovered. Without the >> lossy downgrade scenario, we have no path back to a previous software >> version. >> >> As for the min/max finalized version, I'm not totally clear on cases >> where these would differ. I think for "metadata.version" we just want a >> single finalized version for the whole cluster (like we do for IBP). >> >> -David >> >> >> On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio >> wrote: >> >>> On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe >>> wrote: >>> > > 11. For downgrades, it would be useful to describe how to determine >>> the >>> > > downgrade process (generating new snapshot, propagating the >>> snapshot, etc) >>> > > has completed. We could block the UpdateFeature request until the >>> process >>> > > is completed. However, since the process could take time, the >>> request could >>> > > time out. Another way is through DescribeFeature and the server only reports downgraded versions after the process is completed.
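The DowngradeType int8 field from item 101 above could look something like the following. The names and byte codes are illustrative (borrowing the NONE/SAFE/UNSAFE split discussed in this thread), not the final RPC definition:

```java
// Hypothetical sketch of a DowngradeType int8 field for UpgradeFeatureRequest.
// Values are made up for illustration; the KIP defines the real codes.
public enum DowngradeType {
    NONE((byte) 0),      // plain upgrade; any downgrade is rejected
    SAFE((byte) 1),      // downgrade only if no metadata would be lost
    UNSAFE((byte) 2);    // forced, possibly lossy downgrade

    private final byte code;
    DowngradeType(byte code) { this.code = code; }
    public byte code() { return code; }

    // Decode the int8 wire value back into the enum.
    public static DowngradeType fromCode(byte code) {
        for (DowngradeType t : values()) {
            if (t.code == code) return t;
        }
        throw new IllegalArgumentException("Unknown downgrade type: " + code);
    }
}
```

An enum like this avoids the invalid flag combinations (e.g. Upgrade=false plus Force=true) that the dual-boolean scheme permits.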
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jose, Are you saying that for metadata.version X different software versions > could generate different snapshots? If so, I would consider this an > implementation bug, no? The format and content of a snapshot is a > public API that needs to be supported across software versions. I agree this would be a bug. This is probably a non-issue since we will recommend brokers to be at the same software version before performing an upgrade or downgrade. That eliminates the possibility that brokers could generate different snapshots. Hmm. So I think you are proposing the following flow: > 1. Cluster metadata partition replicas establish a quorum using > ApiVersions and the KRaft protocol. > 2. Inactive controllers send a registration RPC to the active controller. > 3. The active controller persists this information to the metadata log. What happens if the inactive controllers send a metadata.version range > that is not compatible with the metadata.version set for the cluster? As we discussed offline, we don't need the explicit registration step. Once a controller has joined the quorum, it will learn about the finalized "metadata.version" level once it reads that record. If it encounters a version it can't support it should probably shutdown since it might not be able to process any more records. However, this does lead to a race condition when a controller is catching up on the metadata log. Let's say a controller comes up and fetches the following records from the leader: > [M1, M2, M3, FeatureLevelRecord] Between the time the controller starts handling requests and it processes the FeatureLevelRecord, it could report an old value of metadata.version when handling ApiVersionsRequest. In the brokers, I believe we delay starting the request handler threads until the broker has been unfenced (meaning, it is reasonably caught up). We might need something similar for controllers. -David On Thu, Oct 14, 2021 at 9:29 PM David Arthur wrote: > Kowshik, thanks for the review! 
> > 7001. An enum sounds like a good idea here. Especially since setting > Upgrade=false and Force=true doesn't really make sense. An enum would avoid > confusing/invalid combinations of flags > > 7002. How about adding --force-downgrade as an alternative to the > --downgrade argument? So, it would take the same arguments (list of > feature:version), but use the DOWNGRADE_UNSAFE option in the RPC. > > 7003. Yes we will need the advanced CLI since we will need to only modify > the "metadata.version" FF. I was kind of wondering if we might want > separate sub-commands for the different operations instead of all under > "update". E.g., "kafka-features.sh > upgrade|downgrade|force-downgrade|delete|describe". > > 7004/7005. The intention in this KIP is that we bump the > "metadata.version" liberally and that most upgrades are backwards > compatible. We're relying on this for feature gating as well as indicating > compatibility. The enum is indeed an implementation detail that is enforced > by the controller when handling UpdateFeaturesRequest. As for lossy > downgrades, this really only applies to the metadata log as we will lose > some fields and records when downgrading to an older version. This is > useful as an escape hatch for cases when a software upgrade has occurred, > the feature flag was increased, and then bugs are discovered. Without the > lossy downgrade scenario, we have no path back to a previous software > version. > > As for the min/max finalized version, I'm not totally clear on cases where > these would differ. I think for "metadata.version" we just want a single > finalized version for the whole cluster (like we do for IBP). > > -David > > > On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio > wrote: > >> On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe wrote: >> > > 11. For downgrades, it would be useful to describe how to determine >> the >> > > downgrade process (generating new snapshot, propagating the snapshot, >> etc) >> > > has completed. 
We could block the UpdateFeature request until the >> process >> > > is completed. However, since the process could take time, the request >> could >> > > time out. Another way is through DescribeFeature and the server only >> > > reports downgraded versions after the process is completed. >> > >> > Hmm.. I think we need to avoid blocking, since we don't know how long >> it will take for all nodes to act on the downgrade request. After all, some >> nodes may be down. >> > >> > But I agree we should have some way of knowing when the upgrade is >> done. DescribeClusterResponse seems like the natural place to put >> information about each node's feature level. While we're at it, we should >> also add a boolean to indicate whether the given node is fenced. (This will >> always be false for ZK mode, of course...) >> > >> >> I agree. I think from the user's point of view, they would like to >> know if it is safe to downgrade the software version of a specific >> broker or controller. I think that it is safe to downgrade a broker or >> controller when the metadata.version has been downgraded and the node >> has generated a snapshot with the downgraded metadata.version.
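The "delay request handling until caught up" idea from earlier in this message can be sketched like this. The class and method names are hypothetical; the point is only that a replica refuses to answer ApiVersionsRequest with a stale finalized level before it has replayed the FeatureLevelRecord:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of gating ApiVersions handling on metadata-log catch-up.
// A controller fetching [M1, M2, M3, FeatureLevelRecord] must not report
// an old metadata.version before the FeatureLevelRecord is processed.
public class MetadataVersionPublisher {
    private final AtomicBoolean caughtUp = new AtomicBoolean(false);
    private volatile short finalizedVersion = -1; // unknown until replayed

    // Called while replaying the metadata log.
    void handleFeatureLevelRecord(short version) {
        finalizedVersion = version;
        caughtUp.set(true);
    }

    // Called on an ApiVersionsRequest. Until caught up, refuse to
    // advertise a version rather than reporting a stale one.
    short finalizedVersionOrThrow() {
        if (!caughtUp.get()) {
            throw new IllegalStateException("still catching up on the metadata log");
        }
        return finalizedVersion;
    }
}
```

This mirrors how brokers delay starting request handler threads until they are unfenced, as suggested for controllers above.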
Re: [DISCUSS] KIP-778 KRaft Upgrades
Kowshik, thanks for the review! 7001. An enum sounds like a good idea here. Especially since setting Upgrade=false and Force=true doesn't really make sense. An enum would avoid confusing/invalid combinations of flags 7002. How about adding --force-downgrade as an alternative to the --downgrade argument? So, it would take the same arguments (list of feature:version), but use the DOWNGRADE_UNSAFE option in the RPC. 7003. Yes we will need the advanced CLI since we will need to only modify the "metadata.version" FF. I was kind of wondering if we might want separate sub-commands for the different operations instead of all under "update". E.g., "kafka-features.sh upgrade|downgrade|force-downgrade|delete|describe". 7004/7005. The intention in this KIP is that we bump the "metadata.version" liberally and that most upgrades are backwards compatible. We're relying on this for feature gating as well as indicating compatibility. The enum is indeed an implementation detail that is enforced by the controller when handling UpdateFeaturesRequest. As for lossy downgrades, this really only applies to the metadata log as we will lose some fields and records when downgrading to an older version. This is useful as an escape hatch for cases when a software upgrade has occurred, the feature flag was increased, and then bugs are discovered. Without the lossy downgrade scenario, we have no path back to a previous software version. As for the min/max finalized version, I'm not totally clear on cases where these would differ. I think for "metadata.version" we just want a single finalized version for the whole cluster (like we do for IBP). -David On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio wrote: > On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe wrote: > > > 11. For downgrades, it would be useful to describe how to determine the > > > downgrade process (generating new snapshot, propagating the snapshot, > etc) > > > has completed. 
We could block the UpdateFeature request until the > process > > > is completed. However, since the process could take time, the request > could > > > time out. Another way is through DescribeFeature and the server only > > > reports downgraded versions after the process is completed. > > > > Hmm.. I think we need to avoid blocking, since we don't know how long it > will take for all nodes to act on the downgrade request. After all, some > nodes may be down. > > > > But I agree we should have some way of knowing when the upgrade is done. > DescribeClusterResponse seems like the natural place to put information > about each node's feature level. While we're at it, we should also add a > boolean to indicate whether the given node is fenced. (This will always be > false for ZK mode, of course...) > > > > I agree. I think from the user's point of view, they would like to > know if it is safe to downgrade the software version of a specific > broker or controller. I think that it is safe to downgrade a broker or > controller when the metadata.version has been downgraded and the node > has generated a snapshot with the downgraded metadata.version. > > -- > -Jose > -- David Arthur
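The subcommand layout floated in 7003 above ("kafka-features.sh upgrade|downgrade|force-downgrade|delete|describe") could dispatch roughly as follows. The RPC mappings in the returned strings are illustrative, following the suggestion that --downgrade requests a safe downgrade while --force-downgrade uses the DOWNGRADE_UNSAFE option; the real tool's parsing and wire calls differ:

```java
// Sketch only: maps each hypothetical subcommand to the request it would issue.
public class FeaturesCli {
    static String dispatch(String[] args) {
        if (args.length == 0) return "usage: upgrade|downgrade|force-downgrade|delete|describe";
        switch (args[0]) {
            case "upgrade":         return "UpdateFeatures(downgradeType=NONE)";
            case "downgrade":       return "UpdateFeatures(downgradeType=SAFE)";
            case "force-downgrade": return "UpdateFeatures(downgradeType=UNSAFE)";
            case "delete":          return "UpdateFeatures(level=0)";
            case "describe":        return "ApiVersions/DescribeFeatures";
            default:                return "unknown subcommand: " + args[0];
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(args));
    }
}
```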
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe wrote: > > 11. For downgrades, it would be useful to describe how to determine the > > downgrade process (generating new snapshot, propagating the snapshot, etc) > > has completed. We could block the UpdateFeature request until the process > > is completed. However, since the process could take time, the request could > > time out. Another way is through DescribeFeature and the server only > > reports downgraded versions after the process is completed. > > Hmm.. I think we need to avoid blocking, since we don't know how long it will > take for all nodes to act on the downgrade request. After all, some nodes may > be down. > > But I agree we should have some way of knowing when the upgrade is done. > DescribeClusterResponse seems like the natural place to put information about > each node's feature level. While we're at it, we should also add a boolean to > indicate whether the given node is fenced. (This will always be false for ZK > mode, of course...) > I agree. I think from the user's point of view, they would like to know if it is safe to downgrade the software version of a specific broker or controller. I think that it is safe to downgrade a broker or controller when the metadata.version has been downgraded and the node has generated a snapshot with the downgraded metadata.version. -- -Jose
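José's safety rule above reduces to a simple two-part check. This is a sketch with made-up names, assuming a node tracks both the finalized metadata.version and the version of its latest snapshot:

```java
// Sketch: a node can safely move to older software once (a) the finalized
// metadata.version has been lowered to the target and (b) the node has
// generated a snapshot at (or below) that lowered version.
public class DowngradeSafety {
    static boolean safeToDowngradeSoftware(short finalizedVersion,
                                           short targetVersion,
                                           short latestSnapshotVersion) {
        return finalizedVersion <= targetVersion
            && latestSnapshotVersion <= targetVersion;
    }

    public static void main(String[] args) {
        // metadata.version lowered to 2 and a version-2 snapshot exists: safe
        System.out.println(safeToDowngradeSoftware((short) 2, (short) 2, (short) 2));
        // the latest snapshot is still at version 3: not yet safe
        System.out.println(safeToDowngradeSoftware((short) 2, (short) 2, (short) 3));
    }
}
```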
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Thu, Oct 7, 2021 at 5:20 PM Jun Rao wrote: > 7. Jose, what control records were you referring? > Hey Jun, in KRaft we have 3 control records. - LeaderChangeMessage - this is persistent in the replica log when a new leader gets elected and the epoch increases. We never included this record in the snapshot so this KIP doesn't need to process these control records. - SnapshotFooterRecord and SnapshotHeaderRecord - these records are control records at the end and beginning of the snapshot respectively. They only exist in the snapshot and never in the replicated log. This KIP needs to make sure that we downgrade these records. Thanks! -Jose
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Tue, Oct 5, 2021 at 8:53 AM David Arthur wrote: > 2. Generate snapshot on downgrade > > > Metadata snapshot is generated and sent to the other inactive > > controllers and to brokers (this snapshot may be lossy!) > > Why do we need to send this downgraded snapshot to the brokers? The > > replicas have seen the FeatureLevelRecord and noticed the downgrade. > > Can we have the replicas each independently generate a downgraded > > snapshot at the offset for the downgraded FeatureLevelRecord? I assume > > that the active controller will guarantee that all records after the > > FeatureLevelRecord use the downgraded version. If so, it would be good > > to mention that explicitly. > > > Similar to above, yes a broker that detects a downgrade via > FeatureLevelRecord could generate its own downgrade snapshot and reload its > state from that. This does get a little fuzzy when we consider cases where > brokers are on different software versions and could be generating a > downgrade snapshot for version X, but using different versions of the code. > It might be safer to let the controller generate the snapshot so each > broker (regardless of software version) gets the same records. However, for > upgrades (or downgrades) we expect the whole cluster to be running the same > software version before triggering the metadata.version change, so perhaps > this isn't a likely scenario. Thoughts? > > Are you saying that for metadata.version X different software versions could generate different snapshots? If so, I would consider this an implementation bug, no? The format and content of a snapshot is a public API that needs to be supported across software versions. > 3. Max metadata version > > >For the first release that supports metadata.version, we can simply > > initialize metadata.version with the current (and only) version. For future > > releases, we will need a mechanism to bootstrap a particular version. 
This > > could be done using the meta.properties file or some similar mechanism. The > > reason we need to allow for a specific initial version is to support the > > use case of starting a Kafka cluster at version X with an older > > metadata.version. > > > I assume that the Active Controller will learn the metadata version of > > the broker through the BrokerRegistrationRequest. How will the Active > > Controller learn about the max metadata version of the inactive > > controller nodes? We currently don't send a registration request from > > the inactive controller to the active controller. > > > This came up during the design, but I neglected to add it to the KIP. We > will need a mechanism for determining the supported features of each > controller similar to how brokers use BrokerRegistrationRequest. Perhaps > controllers could write a FeatureLevelRecord (or similar) to the metadata > log indicating their supported version. WDYT? > Hmm. So I think you are proposing the following flow: 1. Cluster metadata partition replicas establish a quorum using ApiVersions and the KRaft protocol. 2. Inactive controllers send a registration RPC to the active controller. 3. The active controller persists this information to the metadata log. What happens if the inactive controllers send a metadata.version range that is not compatible with the metadata.version set for the cluster? > Why do you need to bootstrap a particular version? Isn't the intent > > that the broker will learn the active metadata version by reading the > > metadata before unfencing? > > > This bootstrapping is needed for when a KRaft cluster is first started. If > we don't have this mechanism, the cluster can't really do anything until > the operator finalizes the metadata.version with the tool. The > bootstrapping will be done by the controller and the brokers will see this > version as a record (like you say). I'll add some text to clarify this. > > Got it. 
A new cluster will use the metadata.version of the first active controller. The first active controller will persist this information on the metadata log. All replicas (inactive controller and brokers) will configure themselves based on this record. > 4. Reject Registration - This is related to the bullet point above. > > What will be the behavior of the active controller if the broker sends > > a metadata version that is not compatible with the cluster wide > > metadata version? > > > If a broker starts up with a lower supported version range than the current > cluster metadata.version, it should log an error and shutdown. This is in > line with KIP-584. > Okay. We need to extend this for the controller case. Thanks -- -Jose
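The bootstrap flow summarized above (first active controller persists the version; replicas configure from the record) can be sketched as follows. The record shape and method names here are illustrative, not the actual metadata log implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of new-cluster bootstrap: on becoming active with a log that has
// no metadata.version record, the first active controller persists a
// FeatureLevelRecord for the initial version (operator-supplied or this
// software's latest). Replaying replicas configure from that record.
public class Bootstrap {
    record FeatureLevelRecord(String feature, short level) {}

    static List<FeatureLevelRecord> bootstrap(List<FeatureLevelRecord> log,
                                              short initialVersion) {
        if (log.stream().noneMatch(r -> r.feature().equals("metadata.version"))) {
            List<FeatureLevelRecord> out = new ArrayList<>(log);
            out.add(new FeatureLevelRecord("metadata.version", initialVersion));
            return out;
        }
        return log; // already bootstrapped; idempotent
    }
}
```

The idempotence matters: a later controller election must not write a second, conflicting initial record.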
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi David, Thanks for the great KIP! It's good to see KIP-584 is starting to get used. A few comments below. 7001. In the UpdateFeaturesRequest definition, there is a newly introduced ForceDowngrade parameter. There is also an existing AllowDowngrade parameter. My understanding is that the AllowDowngrade flag could be set in a FeatureUpdate whenever a downgrade is attempted via the Admin.updateFeatures API. Then, the ForceDowngrade flag could be set if we would like to ask the controller to proceed with the downgrade even if it was deemed unsafe. Could we document the distinction between the two flags in the KIP? If we can just keep one of the flags, that would simplify things. But, as it stands it seems like we will need both flags. To avoid confusion, does it make sense to deprecate the dual boolean flags and instead change the updateFeatures API to use an enum parameter called DOWNGRADE_REQUEST_TYPE with 3 values: NONE (default value meaning no downgrade is allowed), DOWNGRADE_SAFE (ask the controller to downgrade if the operation is safe) and DOWNGRADE_UNSAFE (ask the controller to downgrade disregarding safety)? 7002. The KIP introduces a ForceDowngrade flag into the UpdateFeaturesRequest definition. But the KIP also introduces a generic --force CLI flag in the kafka-features.sh tool. Should we instead call the CLI flag --force-downgrade so that its intent is more specific? 7003. Currently the kafka-features.sh tool does not yet implement the "Advanced CLI usage" explained in KIP-584. The associated jira is KAFKA-10621. For the needs of this KIP, do you need the advanced CLI or would the basic version work? 7004. KIP-584 feature flag max version is typically incremented to indicate a breaking change. It usually means that version level downgrades would break something. Could this KIP explain why it is useful to support a lossy downgrade for the metadata.version feature flag? i.e. what are some situations when a lossy downgrade is useful? 7005. 
Regarding downgrade, I read the `enum MetadataVersions` type introduced in this KIP that captures the rules for backwards compatibility. For my understanding, is this enum an implementation detail on the controller specific to this feature flag? i.e. when processing the UpdateFeaturesRequest, would the controller introduce a flag-specific logic (based on the enum values) to decide whether a downgrade is allowed? Cheers, Kowshik On Tue, Oct 12, 2021 at 2:13 PM David Arthur wrote: > Jun and Colin, thanks very much for the comments. See below > > 10. Colin, I agree that ApiVersionsRequest|Response is the most > straightforward approach here > > 11. This brings up an interesting point. Once an UpdateFeature request is > finished processing, a subsequent "describe features" (ApiVersionsRequest) > made by the admin client will go to a random broker, and so might not > reflect the feature update. We could add blocking behavior to the admin > client so it would poll ApiVersions for some time until the expected > version is reflected on that broker. However, this does not mean _every_ > broker has finished upgrading/downgrading -- just the one handling that > request. Maybe we could have the admin client poll all the brokers until > the expected version is seen. > > If at a later time a broker comes online that needs to process an > upgrade/downgrade, I don't think there will be a problem since the broker > will be fenced until it catches up on the latest metadata (which will include > the feature level). > > 12. Yes, we will need changes to the admin client for "updateFeatures". > I'll update the KIP to reflect this. > > 13. I'll expand the paragraph on the initial "metadata.version" into its > own section and add some detail. > > 14/15. As mentioned, I think we can avoid this and rely on > brokers/controllers generating their own snapshots. We will probably want > some kind of manual recovery mode where we can force load a snapshot, but > that is out of scope here (I think..) 
> > 16. Automatic upgrades should be feasible, but I think we will want to > start with manual upgrades (while we work out the design and fix bugs). > Following the design detailed in this KIP, we could have a controller > component that (as you suggest) automatically finalizes the feature to the > max of all broker supported versions. I can include a section on this or we > could defer to a future KIP. WDYT? > > -David > > > On Tue, Oct 12, 2021 at 1:57 PM Colin McCabe wrote: > > > On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote: > > > Hi, David, > > > > > > Thanks for the KIP. A few comments below. > > > > > > 10. It would be useful to describe how the controller node determines > the > > > RPC version used to communicate to other controller nodes. There seems > to > > > be a bootstrap problem. A controller node can't read the log and > > > therefore the feature level until a quorum leader is elected. But > leader > > > election requires an RPC. > > > > > > > Hi Jun, > > > > I agree that we need to figure this out.
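The flag-specific, controller-side check that 7005 asks about could look roughly like the following. The version levels and their compatibility properties are invented for illustration; only the shape (an enum that encodes per-level downgrade rules) reflects the KIP's `enum MetadataVersions` idea:

```java
// Sketch of a MetadataVersions enum encoding backwards-compatibility rules.
// Each level records whether downgrading past it loses metadata.
// The concrete levels and flags here are made up.
public enum MetadataVersions {
    V1((short) 1, true),
    V2((short) 2, true),   // hypothetically only reordered fields; lossless to drop
    V3((short) 3, false);  // hypothetically added record types; lossy to drop

    final short level;
    final boolean losslessDowngradeToPrevious;
    MetadataVersions(short level, boolean lossless) {
        this.level = level;
        this.losslessDowngradeToPrevious = lossless;
    }

    // A safe (non-forced) downgrade is allowed only if every level crossed
    // on the way down can be dropped without losing metadata.
    static boolean safeDowngradeAllowed(MetadataVersions from, MetadataVersions to) {
        for (MetadataVersions v : values()) {
            if (v.level > to.level && v.level <= from.level && !v.losslessDowngradeToPrevious) {
                return false;
            }
        }
        return true;
    }
}
```

Under this shape, DOWNGRADE_SAFE would consult `safeDowngradeAllowed` while DOWNGRADE_UNSAFE would bypass it.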
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jun and Colin, thanks very much for the comments. See below 10. Colin, I agree that ApiVersionsRequest|Response is the most straightforward approach here 11. This brings up an interesting point. Once an UpdateFeature request is finished processing, a subsequent "describe features" (ApiVersionsRequest) made by the admin client will go to a random broker, and so might not reflect the feature update. We could add blocking behavior to the admin client so it would poll ApiVersions for some time until the expected version is reflected on that broker. However, this does not mean _every_ broker has finished upgrading/downgrading -- just the one handling that request. Maybe we could have the admin client poll all the brokers until the expected version is seen. If at a later time a broker comes online that needs to process an upgrade/downgrade, I don't think there will be a problem since the broker will be fenced until it catches up on the latest metadata (which will include the feature level). 12. Yes, we will need changes to the admin client for "updateFeatures". I'll update the KIP to reflect this. 13. I'll expand the paragraph on the initial "metadata.version" into its own section and add some detail. 14/15. As mentioned, I think we can avoid this and rely on brokers/controllers generating their own snapshots. We will probably want some kind of manual recovery mode where we can force load a snapshot, but that is out of scope here (I think..) 16. Automatic upgrades should be feasible, but I think we will want to start with manual upgrades (while we work out the design and fix bugs). Following the design detailed in this KIP, we could have a controller component that (as you suggest) automatically finalizes the feature to the max of all broker supported versions. I can include a section on this or we could defer to a future KIP. WDYT? 
-David On Tue, Oct 12, 2021 at 1:57 PM Colin McCabe wrote: > On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote: > > Hi, David, > > > > Thanks for the KIP. A few comments below. > > > > 10. It would be useful to describe how the controller node determines the > > RPC version used to communicate to other controller nodes. There seems to > > be a bootstrap problem. A controller node can't read the log and > > therefore the feature level until a quorum leader is elected. But leader > > election requires an RPC. > > > > Hi Jun, > > I agree that we need to figure this out. I think it would be best to > simply use ApiVersionsRequest and ApiVersionsResponse. That way each > controller can use the appropriate RPC versions when communicating with > each other controller. This would allow us to upgrade them one by one. > > > 11. For downgrades, it would be useful to describe how to determine the > > downgrade process (generating new snapshot, propagating the snapshot, > etc) > > has completed. We could block the UpdateFeature request until the process > > is completed. However, since the process could take time, the request > could > > time out. Another way is through DescribeFeature and the server only > > reports downgraded versions after the process is completed. > > Hmm.. I think we need to avoid blocking, since we don't know how long it > will take for all nodes to act on the downgrade request. After all, some > nodes may be down. > > But I agree we should have some way of knowing when the upgrade is done. > DescribeClusterResponse seems like the natural place to put information > about each node's feature level. While we're at it, we should also add a > boolean to indicate whether the given node is fenced. (This will always be > false for ZK mode, of course...) > > > > > 12. Since we are changing UpdateFeaturesRequest, do we need to change the > > AdminClient api for updateFeatures too? > > > > 13. 
For the paragraph starting with "In the absence of an operator > defined > > value for metadata.version", in KIP-584, we described how to finalize > > features with New cluster bootstrap. In that case, it's inconvenient for > > the users to have to run an admin tool to finalize the version for each > > feature. Instead, the system detects that the /features path is missing > in > > ZK and thus automatically finalizes every feature with the latest > supported > > version. Could we do something similar in the KRaft mode? > > > > Yes, I think we need to have a section describing how this ties into > creating new clusters. The simplest thing is probably to have the > controller notice that there are no FeatureRecords at all, and then create > a record for the latest metadata.version. This is analogous to how we > assume the latest IBP if no IBP is configured. > > There is also the question of how to create a cluster that starts up with > something other than the latest metadata.version. We could have a config > for that, like initial.metadata.version, or pass a flag to the > controllers... alternately, we could pass a flag to "kafka-storage.sh > format". > > > 14. After the quorum leader generates a new snapshot, how do we force other nodes to pick up the new snapshot?
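The "poll all the brokers until the expected version is seen" idea from item 11 earlier in this message can be sketched as a client-side loop. The `describeFinalizedLevel` callback stands in for a real per-broker ApiVersions exchange; names are hypothetical:

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

// Sketch of admin-client polling after an UpdateFeatures call: keep asking
// every broker for its finalized level until all report at least the
// expected level, or the attempt budget is exhausted.
public class FeaturePoller {
    static boolean awaitFinalized(List<Integer> brokerIds,
                                  int expectedLevel,
                                  IntUnaryOperator describeFinalizedLevel,
                                  int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            boolean allSeen = brokerIds.stream()
                .allMatch(id -> describeFinalizedLevel.applyAsInt(id) >= expectedLevel);
            if (allSeen) return true;
            // a real client would sleep/backoff between attempts here
        }
        return false;
    }
}
```

As noted above, success on one broker says nothing about the others, which is why the loop checks every broker rather than whichever one handled the update.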
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Wed, Sep 29, 2021, at 10:34, David Arthur wrote: > Hey all, > > I'd like to start a discussion around upgrades for KRaft. This design does > not cover any of the ZooKeeper to KRaft migration (which will be covered in > a future KIP). > > The proposal includes the addition of a "metadata.version" feature flag and > deprecates the IBP. A procedure for downgrades is also included in the KIP. > > https://cwiki.apache.org/confluence/x/WAxACw > > Thanks! > David Hi David, Thanks for the KIP! It's good to see us finding a use for the feature flag mechanism. Since this is the first use of the mechanism, I think we should take some time to fix some issues with the original KIP, before the design gets "set in stone," so to speak. 1. It is not necessary to have two version numbers for each feature flag. I believe the rationale for adding this is that a few features would find it useful. However, those features can just define two flags (x_min and x_max). So I think each feature should have only one feature level number rather than a min and a max. This will simplify a lot of things (including your KIP...) 2. kafka-features.sh should use the more modern Argparse4J format rather than the old school "everything is a separate flag" format. This would allow us to have subcommands. For example, we could have an "upgrade" subcommand, a "nodes" subcommand, etc. Check out kafka-storage.sh for an example of how this can work. There are three subcommands (info, format, random-uuid) and they each have a separate set of flags that they take. For example, "kafka-storage.sh format" takes an --ignore-formatted flag, but this is not valid for "kafka-storage.sh info". best, Colin
Re: [DISCUSS] KIP-778 KRaft Upgrades
On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote: > Hi, David, > > Thanks for the KIP. A few comments below. > > 10. It would be useful to describe how the controller node determines the > RPC version used to communicate to other controller nodes. There seems to > be a bootstrap problem. A controller node can't read the log and > therefore the feature level until a quorum leader is elected. But leader > election requires an RPC. > Hi Jun, I agree that we need to figure this out. I think it would be best to simply use ApiVersionsRequest and ApiVersionsResponse. That way each controller can use the appropriate RPC versions when communicating with each other controller. This would allow us to upgrade them one by one. > 11. For downgrades, it would be useful to describe how to determine the > downgrade process (generating new snapshot, propagating the snapshot, etc) > has completed. We could block the UpdateFeature request until the process > is completed. However, since the process could take time, the request could > time out. Another way is through DescribeFeature and the server only > reports downgraded versions after the process is completed. Hmm.. I think we need to avoid blocking, since we don't know how long it will take for all nodes to act on the downgrade request. After all, some nodes may be down. But I agree we should have some way of knowing when the upgrade is done. DescribeClusterResponse seems like the natural place to put information about each node's feature level. While we're at it, we should also add a boolean to indicate whether the given node is fenced. (This will always be false for ZK mode, of course...) > > 12. Since we are changing UpdateFeaturesRequest, do we need to change the > AdminClient api for updateFeatures too? > > 13. For the paragraph starting with "In the absence of an operator defined > value for metadata.version", in KIP-584, we described how to finalize > features with New cluster bootstrap. 
In that case, it's inconvenient for > the users to have to run an admin tool to finalize the version for each > feature. Instead, the system detects that the /features path is missing in > ZK and thus automatically finalizes every feature with the latest supported > version. Could we do something similar in the KRaft mode? > Yes, I think we need to have a section describing how this ties into creating new clusters. The simplest thing is probably to have the controller notice that there are no FeatureRecords at all, and then create a record for the latest metadata.version. This is analogous to how we assume the latest IBP if no IBP is configured. There is also the question of how to create a cluster that starts up with something other than the latest metadata.version. We could have a config for that, like initial.metadata.version, or pass a flag to the controllers... alternatively, we could pass a flag to "kafka-storage.sh format". > 14. After the quorum leader generates a new snapshot, how do we force other > nodes to pick up the new snapshot? > > 15. I agree with Jose that it will be useful to describe when generating a > new snapshot is needed. To me, it seems the new snapshot is only needed > when incompatible changes are made. > I think it would be good to always generate a snapshot right before the upgrade. Then, if the upgrade goes wrong, we have a metadata state we could revert back to, albeit with some effort and potential data loss. But, I agree that the rationale for this should be spelled out in the KIP. I also think that the brokers should generate their own snapshots rather than fetching from the controller, both in the upgrade and downgrade case. Jose mentioned this earlier and I agree. best, Colin > 7. Jose, what control records were you referring to? > > Thanks, > > Jun > > > On Tue, Oct 5, 2021 at 8:53 AM David Arthur wrote: > >> Jose, thanks for the thorough review and comments!
>> >> I am out of the office until next week, so I probably won't be able to >> update the KIP until then. Here are some replies to your questions: >> >> 1. Generate snapshot on upgrade >> > > Metadata snapshot is generated and sent to the other nodes >> > Why does the Active Controller need to generate a new snapshot and >> > force a snapshot fetch from the replicas (inactive controller and >> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good >> > enough to communicate the upgrade to the replicas? >> >> >> You're right, we don't necessarily need to _transmit_ a snapshot, since >> each node can generate its own equivalent snapshot >> >> 2. Generate snapshot on downgrade >> > > Metadata snapshot is generated and sent to the other inactive >> > controllers and to brokers (this snapshot may be lossy!) >> > Why do we need to send this downgraded snapshot to the brokers? The >> > replicas have seen the FeatureLevelRecord and noticed the downgrade. >> > Can we have the replicas each independently generate a downgraded >> > snapshot at the offset for th
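[Editor's note] David's answer to point 10 above (let controllers use ApiVersionsRequest/Response so each node picks appropriate RPC versions per peer) boils down to intersecting supported version ranges. A minimal sketch under that assumption; the function name is invented and this is not the actual Kafka negotiation code:

```python
from typing import Optional

def negotiate(local_min: int, local_max: int,
              remote_min: int, remote_max: int) -> Optional[int]:
    """Return the highest RPC version both sides support, or None when
    the advertised ranges do not overlap (connection must be rejected)."""
    low = max(local_min, remote_min)
    high = min(local_max, remote_max)
    return high if high >= low else None

# An upgraded controller (versions 0-3) talking to a not-yet-upgraded
# peer (versions 0-2) falls back to version 2, which is what allows
# upgrading the quorum one node at a time.
print(negotiate(0, 3, 0, 2))
```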
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, One more comment. 16. The main reason why KIP-584 requires finalizing a feature manually is that in the ZK world, the controller doesn't know all brokers in a cluster. A broker temporarily down is not registered in ZK. In the KRaft world, the controller keeps track of all brokers, including those that are temporarily down. This makes it possible for the controller to automatically finalize a feature: it's safe to do so when all brokers support that feature. This will make the upgrade process much simpler since no manual command is required to turn on a new feature. Have we considered this? Thanks, Jun On Thu, Oct 7, 2021 at 5:19 PM Jun Rao wrote: > Hi, David, > > Thanks for the KIP. A few comments below. > > 10. It would be useful to describe how the controller node determines the > RPC version used to communicate to other controller nodes. There seems to > be a bootstrap problem. A controller node can't read the log and > therefore the feature level until a quorum leader is elected. But leader > election requires an RPC. > > 11. For downgrades, it would be useful to describe how to determine the > downgrade process (generating new snapshot, propagating the snapshot, etc) > has completed. We could block the UpdateFeature request until the process > is completed. However, since the process could take time, the request could > time out. Another way is through DescribeFeature and the server only > reports downgraded versions after the process is completed. > > 12. Since we are changing UpdateFeaturesRequest, do we need to change the > AdminClient api for updateFeatures too? > > 13. For the paragraph starting with "In the absence of an operator > defined value for metadata.version", in KIP-584, we described how to > finalize features with New cluster bootstrap. In that case, it's > inconvenient for the users to have to run an admin tool to finalize the > version for each feature.
Instead, the system detects that the /features > path is missing in ZK and thus automatically finalizes every feature with > the latest supported version. Could we do something similar in the KRaft > mode? > > 14. After the quorum leader generates a new snapshot, how do we force > other nodes to pick up the new snapshot? > > 15. I agree with Jose that it will be useful to describe when generating a > new snapshot is needed. To me, it seems the new snapshot is only needed > when incompatible changes are made. > > 7. Jose, what control records were you referring to? > > Thanks, > > Jun > > > On Tue, Oct 5, 2021 at 8:53 AM David Arthur > wrote: > >> Jose, thanks for the thorough review and comments! >> >> I am out of the office until next week, so I probably won't be able to >> update the KIP until then. Here are some replies to your questions: >> >> 1. Generate snapshot on upgrade >> > > Metadata snapshot is generated and sent to the other nodes >> > Why does the Active Controller need to generate a new snapshot and >> > force a snapshot fetch from the replicas (inactive controller and >> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good >> > enough to communicate the upgrade to the replicas? >> >> >> You're right, we don't necessarily need to _transmit_ a snapshot, since >> each node can generate its own equivalent snapshot >> >> 2. Generate snapshot on downgrade >> > > Metadata snapshot is generated and sent to the other inactive >> > controllers and to brokers (this snapshot may be lossy!) >> > Why do we need to send this downgraded snapshot to the brokers? The >> > replicas have seen the FeatureLevelRecord and noticed the downgrade. >> > Can we have the replicas each independently generate a downgraded >> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume >> > that the active controller will guarantee that all records after the >> > FeatureLevelRecord use the downgraded version.
If so, it would be good >> > to mention that explicitly. >> >> >> Similar to above, yes a broker that detects a downgrade via >> FeatureLevelRecord could generate its own downgrade snapshot and reload >> its >> state from that. This does get a little fuzzy when we consider cases where >> brokers are on different software versions and could be generating a >> downgrade snapshot for version X, but using different versions of the >> code. >> It might be safer to let the controller generate the snapshot so each >> broker (regardless of software version) gets the same records. However, >> for >> upgrades (or downgrades) we expect the whole cluster to be running the >> same >> software version before triggering the metadata.version change, so perhaps >> this isn't a likely scenario. Thoughts? >> >> >> 3. Max metadata version >> > >For the first release that supports metadata.version, we can simply >> > initialize metadata.version with the current (and only) version. For >> future >> > releases, we will need a mechanism to bootstrap a particular version. >> This >> > could be done using the meta.properties file or some similar mechanism. >> The >> > reason we need to allow for a specific initial version is to support the >> > use case of starting a Kafka cluster at version X with an older >> > metadata.version. >> > >> > I assume that the Active Controller will learn the metadata version of >> > the broker through the BrokerRegistrationRequest. How will the Active >> > Controller learn about the max metadata version of the inactive >> > controller nodes? We currently don't send a registration request from >> > the inactive controller to the active controller. >> >> >> This came up during the design, but I neglected to add it to the KIP. We >> will need a mechanism for determining the supported features of each >> controller similar to how brokers use BrokerRegistrationRequest. Perhaps >> controllers could write a FeatureLe
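[Editor's note] Jun's point 16 above (a KRaft controller tracks every registered broker, even temporarily-down ones, so it could finalize a feature automatically once the whole cluster supports it) amounts to taking the minimum of the brokers' max supported levels. A rough sketch with invented data shapes, not Kafka's implementation:

```python
def auto_finalize_level(broker_ranges, current_level: int) -> int:
    """broker_ranges: (min_supported, max_supported) per registered broker,
    fenced/down brokers included. Return the level to finalize; never
    downgrade automatically."""
    if not broker_ranges:
        return current_level
    # The highest level every broker can handle is the min of the maxes.
    ceiling = min(high for _low, high in broker_ranges)
    return ceiling if ceiling > current_level else current_level

# Three brokers, one still on older software: the cluster is held at
# that broker's ceiling until it too is upgraded.
print(auto_finalize_level([(0, 4), (0, 4), (0, 3)], current_level=2))
```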
Re: [DISCUSS] KIP-778 KRaft Upgrades
Hi, David, Thanks for the KIP. A few comments below. 10. It would be useful to describe how the controller node determines the RPC version used to communicate to other controller nodes. There seems to be a bootstrap problem. A controller node can't read the log and therefore the feature level until a quorum leader is elected. But leader election requires an RPC. 11. For downgrades, it would be useful to describe how to determine the downgrade process (generating new snapshot, propagating the snapshot, etc) has completed. We could block the UpdateFeature request until the process is completed. However, since the process could take time, the request could time out. Another way is through DescribeFeature and the server only reports downgraded versions after the process is completed. 12. Since we are changing UpdateFeaturesRequest, do we need to change the AdminClient api for updateFeatures too? 13. For the paragraph starting with "In the absence of an operator defined value for metadata.version", in KIP-584, we described how to finalize features with New cluster bootstrap. In that case, it's inconvenient for the users to have to run an admin tool to finalize the version for each feature. Instead, the system detects that the /features path is missing in ZK and thus automatically finalizes every feature with the latest supported version. Could we do something similar in the KRaft mode? 14. After the quorum leader generates a new snapshot, how do we force other nodes to pick up the new snapshot? 15. I agree with Jose that it will be useful to describe when generating a new snapshot is needed. To me, it seems the new snapshot is only needed when incompatible changes are made. 7. Jose, what control records were you referring to? Thanks, Jun On Tue, Oct 5, 2021 at 8:53 AM David Arthur wrote: > Jose, thanks for the thorough review and comments! > > I am out of the office until next week, so I probably won't be able to > update the KIP until then.
Here are some replies to your questions: > > 1. Generate snapshot on upgrade > > > Metadata snapshot is generated and sent to the other nodes > > Why does the Active Controller need to generate a new snapshot and > > force a snapshot fetch from the replicas (inactive controller and > > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good > > enough to communicate the upgrade to the replicas? > > > You're right, we don't necessarily need to _transmit_ a snapshot, since > each node can generate its own equivalent snapshot > > 2. Generate snapshot on downgrade > > > Metadata snapshot is generated and sent to the other inactive > > controllers and to brokers (this snapshot may be lossy!) > > Why do we need to send this downgraded snapshot to the brokers? The > > replicas have seen the FeatureLevelRecord and noticed the downgrade. > > Can we have the replicas each independently generate a downgraded > > snapshot at the offset for the downgraded FeatureLevelRecord? I assume > > that the active controller will guarantee that all records after the > > FeatureLevelRecord use the downgraded version. If so, it would be good > > to mention that explicitly. > > > Similar to above, yes a broker that detects a downgrade via > FeatureLevelRecord could generate its own downgrade snapshot and reload its > state from that. This does get a little fuzzy when we consider cases where > brokers are on different software versions and could be generating a > downgrade snapshot for version X, but using different versions of the code. > It might be safer to let the controller generate the snapshot so each > broker (regardless of software version) gets the same records. However, for > upgrades (or downgrades) we expect the whole cluster to be running the same > software version before triggering the metadata.version change, so perhaps > this isn't a likely scenario. Thoughts? > > > 3.
Max metadata version > > >For the first release that supports metadata.version, we can simply > > initialize metadata.version with the current (and only) version. For > future > > releases, we will need a mechanism to bootstrap a particular version. > This > > could be done using the meta.properties file or some similar mechanism. > The > > reason we need to allow for a specific initial version is to support the > > use case of starting a Kafka cluster at version X with an older > > metadata.version. > > > I assume that the Active Controller will learn the metadata version of > > the broker through the BrokerRegistrationRequest. How will the Active > > Controller learn about the max metadata version of the inactive > > controller nodes? We currently don't send a registration request from > > the inactive controller to the active controller. > > > This came up during the design, but I neglected to add it to the KIP. We > will need a mechanism for determining the supported features of each > controller similar to how brokers use BrokerRegistrationRequest. Perhaps > controllers could write a FeatureLe
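[Editor's note] The downgrade-snapshot idea quoted above (each replica independently rewrites its state at the downgrading FeatureLevelRecord's offset, possibly losing information) can be illustrated as below. The flat record tuples and the notion of a per-record level are simplifications for illustration, not Kafka's record format:

```python
def downgrade_snapshot(records, target_level: int):
    """records: (offset, record_level, payload) tuples from the metadata
    log, where record_level stands in for the metadata.version that
    introduced the record's schema. Keep what the target level
    understands; report what gets dropped (this is why it can be lossy)."""
    kept = [r for r in records if r[1] <= target_level]
    dropped = [r for r in records if r[1] > target_level]
    return kept, dropped

log = [(0, 1, "TopicRecord"), (1, 3, "BarRecord"), (2, 1, "PartitionRecord")]
kept, dropped = downgrade_snapshot(log, target_level=2)
print(dropped)  # the newer-level record is lost on downgrade
```

Because every replica runs the same deterministic rewrite at the same offset, each node can generate this snapshot locally instead of fetching it from the controller, provided all nodes run the same software version.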
Re: [DISCUSS] KIP-778 KRaft Upgrades
Jose, thanks for the thorough review and comments! I am out of the office until next week, so I probably won't be able to update the KIP until then. Here are some replies to your questions: 1. Generate snapshot on upgrade > > Metadata snapshot is generated and sent to the other nodes > Why does the Active Controller need to generate a new snapshot and > force a snapshot fetch from the replicas (inactive controller and > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good > enough to communicate the upgrade to the replicas? You're right, we don't necessarily need to _transmit_ a snapshot, since each node can generate its own equivalent snapshot 2. Generate snapshot on downgrade > > Metadata snapshot is generated and sent to the other inactive > controllers and to brokers (this snapshot may be lossy!) > Why do we need to send this downgraded snapshot to the brokers? The > replicas have seen the FeatureLevelRecord and noticed the downgrade. > Can we have the replicas each independently generate a downgraded > snapshot at the offset for the downgraded FeatureLevelRecord? I assume > that the active controller will guarantee that all records after the > FatureLevelRecord use the downgraded version. If so, it would be good > to mention that explicitly. Similar to above, yes a broker that detects a downgrade via FeatureLevelRecord could generate its own downgrade snapshot and reload its state from that. This does get a little fuzzy when we consider cases where brokers are on different software versions and could be generating a downgrade snapshot for version X, but using different versions of the code. It might be safer to let the controller generate the snapshot so each broker (regardless of software version) gets the same records. However, for upgrades (or downgrades) we expect the whole cluster to be running the same software version before triggering the metadata.version change, so perhaps this isn't a likely scenario. Thoughts? 3. 
Max metadata version > >For the first release that supports metadata.version, we can simply > initialize metadata.version with the current (and only) version. For future > releases, we will need a mechanism to bootstrap a particular version. This > could be done using the meta.properties file or some similar mechanism. The > reason we need to allow for a specific initial version is to support the > use case of starting a Kafka cluster at version X with an older > metadata.version. I assume that the Active Controller will learn the metadata version of > the broker through the BrokerRegistrationRequest. How will the Active > Controller learn about the max metadata version of the inactive > controller nodes? We currently don't send a registration request from > the inactive controller to the active controller. This came up during the design, but I neglected to add it to the KIP. We will need a mechanism for determining the supported features of each controller similar to how brokers use BrokerRegistrationRequest. Perhaps controllers could write a FeatureLevelRecord (or similar) to the metadata log indicating their supported version. WDYT? Why do you need to bootstrap a particular version? Isn't the intent > that the broker will learn the active metadata version by reading the > metadata before unfencing? This bootstrapping is needed for when a KRaft cluster is first started. If we don't have this mechanism, the cluster can't really do anything until the operator finalizes the metadata.version with the tool. The bootstrapping will be done by the controller and the brokers will see this version as a record (like you say). I'll add some text to clarify this. 4. Reject Registration - This is related to the bullet point above. > What will be the behavior of the active controller if the broker sends > a metadata version that is not compatible with the cluster wide > metadata version?
If a broker starts up with a lower supported version range than the current cluster metadata.version, it should log an error and shut down. This is in line with KIP-584. 5. Discover upgrade > > This snapshot will be a convenient way to let broker and controller > components rebuild their entire in-memory state following an upgrade. > Can we rely on the presence of the FeatureLevelRecord for the metadata > version for this functionality? If so, it avoids having to reload the > snapshot. For upgrades, yes probably since we won't need to "rewrite" any records in this case. For downgrades, we will need to generate the snapshot and reload everything. 6. Metadata version specification > > V4(version=4, isBackwardsCompatible=false, description="New metadata > record type Bar"), Very cool. Do you have plans to generate Apache Kafka HTML > documentation for this information? Would be helpful to display this > information to the user using the kafka-features.sh and feature RPC? Hm good idea :) I'll add a brief section on documentation. This would certainly be very useful. 7. Downg
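[Editor's note] The registration behavior David describes above (a broker whose supported range does not cover the finalized cluster metadata.version must log an error and shut down, per KIP-584) is a simple range test. A sketch under those assumptions, not the actual broker code:

```python
def registration_ok(broker_min: int, broker_max: int, finalized_level: int) -> bool:
    """True when the broker's supported range covers the finalized
    cluster metadata.version; otherwise the broker should log an
    error and exit rather than join the cluster."""
    return broker_min <= finalized_level <= broker_max

# A broker supporting levels 0-2 cannot join a cluster finalized at 3.
print(registration_ok(0, 2, 3))
```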
Re: [DISCUSS] KIP-778 KRaft Upgrades
One more comment. 7. Downgrade records I think we should explicitly mention that the downgrade process will downgrade both metadata records and controller records. In KIP-630 we introduced two control records for snapshots.
Re: [DISCUSS] KIP-778 KRaft Upgrades
Thank you David for the detailed KIP. 1. Generate snapshot on upgrade > Metadata snapshot is generated and sent to the other nodes Why does the Active Controller need to generate a new snapshot and force a snapshot fetch from the replicas (inactive controller and brokers) on an upgrade? Isn't writing the FeatureLevelRecord good enough to communicate the upgrade to the replicas? 2. Generate snapshot on downgrade > Metadata snapshot is generated and sent to the other inactive controllers and > to brokers (this snapshot may be lossy!) Why do we need to send this downgraded snapshot to the brokers? The replicas have seen the FeatureLevelRecord and noticed the downgrade. Can we have the replicas each independently generate a downgraded snapshot at the offset for the downgraded FeatureLevelRecord? I assume that the active controller will guarantee that all records after the FeatureLevelRecord use the downgraded version. If so, it would be good to mention that explicitly. 3. Max metadata version >For the first release that supports metadata.version, we can simply initialize >metadata.version with the current (and only) version. For future releases, we >will need a mechanism to bootstrap a particular version. This could be done >using the meta.properties file or some similar mechanism. The reason we need >to allow for a specific initial version is to support the use case of >starting a Kafka cluster at version X with an older metadata.version. I assume that the Active Controller will learn the metadata version of the broker through the BrokerRegistrationRequest. How will the Active Controller learn about the max metadata version of the inactive controller nodes? We currently don't send a registration request from the inactive controller to the active controller. Why do you need to bootstrap a particular version? Isn't the intent that the broker will learn the active metadata version by reading the metadata before unfencing? 4.
Reject Registration - This is related to the bullet point above. What will be the behavior of the active controller if the broker sends a metadata version that is not compatible with the cluster wide metadata version? 5. Discover upgrade > This snapshot will be a convenient way to let broker and controller > components rebuild their entire in-memory state following an upgrade. Can we rely on the presence of the FeatureLevelRecord for the metadata version for this functionality? If so, it avoids having to reload the snapshot. 6. Metadata version specification > V4(version=4, isBackwardsCompatible=false, description="New metadata record > type Bar"), Very cool. Do you have plans to generate Apache Kafka HTML documentation for this information? Would be helpful to display this information to the user using the kafka-features.sh and feature RPC? -- -Jose
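[Editor's note] The version specification Jose quotes in point 6 (e.g. `V4(version=4, isBackwardsCompatible=false, description="New metadata record type Bar")`) would also let tooling derive whether a given downgrade is lossless: it is only if every level being backed out of is backwards compatible. A hypothetical sketch; the level table and descriptions below are invented for illustration:

```python
# Illustrative table of metadata.version levels, mirroring the
# (version, isBackwardsCompatible, description) shape quoted above.
VERSIONS = {
    1: {"compatible": True,  "description": "Initial version"},
    2: {"compatible": True,  "description": "Added a field to record Foo"},
    3: {"compatible": True,  "description": "Tweak to record Baz"},
    4: {"compatible": False, "description": "New metadata record type Bar"},
}

def downgrade_is_lossless(from_level: int, to_level: int) -> bool:
    """Stepping down is lossless only if every level being backed out
    of is marked backwards compatible."""
    return all(VERSIONS[v]["compatible"] for v in range(to_level + 1, from_level + 1))

print(downgrade_is_lossless(3, 1))  # levels 2 and 3 are compatible
print(downgrade_is_lossless(4, 3))  # level 4 introduced a new record type
```

This same table is what kafka-features.sh or generated HTML documentation could render for operators.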