Hi David, Jose, and Colin,

Thanks for the reply. A few more comments.

12. It seems that we haven't updated the AdminClient accordingly?

14. "Metadata snapshot is generated and sent to the other inactive
controllers and to brokers". I thought we wanted each broker to generate
its own snapshot independently? If only the controller generates the
snapshot, how do we force other brokers to pick it up?
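To make the alternative concrete, the approach discussed later in this thread (each replica generating its own snapshot when it replays the FeatureLevelRecord, rather than fetching one from the active controller) might look roughly like this sketch. All class, field, and record names here are hypothetical and only illustrate the idea:

```python
# Hypothetical sketch: a replica notices a FeatureLevelRecord while
# replaying the metadata log and generates its own local snapshot,
# instead of being forced to fetch one from the active controller.

class Replica:
    def __init__(self):
        self.metadata_version = 1
        self.state = []           # in-memory metadata state
        self.snapshots = {}       # offset -> frozen copy of state

    def replay(self, offset, record):
        if record.get("type") == "FeatureLevelRecord":
            new_version = record["metadata.version"]
            downgrade = new_version < self.metadata_version
            self.metadata_version = new_version
            # Generate a local snapshot at this offset. On a downgrade
            # the snapshot keeps only records the older version
            # understands, so it may be lossy.
            self.snapshots[offset] = [
                r for r in self.state
                if r.get("min_version", 0) <= new_version
            ]
            if downgrade:
                self.state = list(self.snapshots[offset])
        else:
            self.state.append(record)

replica = Replica()
replica.replay(0, {"type": "TopicRecord", "min_version": 1})
replica.replay(1, {"type": "BarRecord", "min_version": 4})
replica.replay(2, {"type": "FeatureLevelRecord", "metadata.version": 4})
replica.replay(3, {"type": "FeatureLevelRecord", "metadata.version": 1})
# After the downgrade, the v4-only BarRecord is dropped from the snapshot.
```

Under this scheme no node needs to be "forced" to pick up a controller-generated snapshot; seeing the FeatureLevelRecord in the log is itself the trigger.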

16. If a feature version is new, one may not want to enable it immediately
after the cluster is upgraded. However, if a feature version has been
stable, requiring every user to run a command to upgrade to that version
seems inconvenient. One way to improve this is for each feature to define
one version as the default. Then, when we upgrade a cluster, we will
automatically upgrade the feature to the default version. An admin could
use the tool to upgrade to a version higher than the default.
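To make the proposal in point 16 concrete, here is a rough sketch of the intended behavior, with all names and version numbers invented for illustration:

```python
# Hypothetical sketch of point 16: each feature declares a stable
# default version; on cluster upgrade the controller auto-finalizes
# to the default, and an admin can still finalize a higher version.

FEATURES = {
    # feature -> (max version this software supports, stable default)
    "metadata.version": {"max_supported": 4, "default": 3},
}

finalized = {}

def on_cluster_upgrade():
    """Auto-finalize each feature at its stable default version."""
    for name, spec in FEATURES.items():
        if finalized.get(name, 0) < spec["default"]:
            finalized[name] = spec["default"]

def admin_upgrade(name, version):
    """Operator explicitly opts in to a version above the default."""
    spec = FEATURES[name]
    if version > spec["max_supported"]:
        raise ValueError("version not supported by this software")
    if version < finalized.get(name, 0):
        raise ValueError("use the downgrade path instead")
    finalized[name] = version

on_cluster_upgrade()
assert finalized["metadata.version"] == 3  # stable default, no command needed
admin_upgrade("metadata.version", 4)       # opt in to the newest version
```

This keeps the manual opt-in for brand-new versions while sparing operators a command for versions that have long been stable.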

20. "The quorum controller can assist with this process by generating a
metadata snapshot after a metadata.version increase has been committed to
the metadata log. This snapshot will be a convenient way to let broker and
controller components rebuild their entire in-memory state following an
upgrade." The new version of the software could read both the new and the
old version. Is generating a new snapshot during upgrade needed?
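For context on why the snapshot may be unnecessary on upgrade: new software can deserialize both the old and the new record formats, so replaying the existing log already rebuilds the full in-memory state. A toy sketch of that reasoning (the version sets are invented for illustration):

```python
# Toy sketch: readers on the new software understand both record
# format versions, so no snapshot is required after an upgrade; only
# a downgrade (old reader, new records) forces a rewritten snapshot.

READABLE_BY = {
    # software version -> record format versions it can deserialize
    "new": {1, 2},
    "old": {1},
}

def can_replay_log(software, record_versions):
    return all(v in READABLE_BY[software] for v in record_versions)

log = [1, 1, 2]  # record format versions in the metadata log after upgrade
assert can_replay_log("new", log)      # upgrade: replay works, no snapshot
assert not can_replay_log("old", log)  # downgrade: must rewrite/snapshot
```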

Jun


On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> > Hi, David,
> >
> > One more comment.
> >
> > 16. The main reason why KIP-584 requires finalizing a feature manually is
> > that in the ZK world, the controller doesn't know all brokers in a
> cluster.
> > A broker temporarily down is not registered in ZK. In the KRaft world,
> the
> > controller keeps track of all brokers, including those that are
> temporarily
> > down. This makes it possible for the controller to automatically
> finalize a
> > feature---it's safe to do so when all brokers support that feature. This
> > will make the upgrade process much simpler since no manual command is
> > required to turn on a new feature. Have we considered this?
> >
> > Thanks,
> >
> > Jun
>
> Hi Jun,
>
> I guess David commented on this point already, but I'll comment as well. I
> always had the perception that users viewed rolls as potentially risky and
> were looking for ways to reduce the risk. Not enabling features right away
> after installing new software seems like one way to do that. If we had a
> feature to automatically upgrade during a roll, I'm not sure that I would
> recommend that people use it, because if something fails, it makes it
> harder to tell if the new feature is at fault, or something else in the new
> software.
>
> We already tell users to do a "double roll" when going to a new IBP. (Just
> to give background to people who haven't heard that phrase, the first roll
> installs the new software, and the second roll updates the IBP). So this
> KIP-778 mechanism is basically very similar to that, except the second
> thing isn't a roll, but just an upgrade command. So I think this is
> consistent with what we currently do.
>
> Also, just like David said, we can always add auto-upgrade later if there
> is demand...
>
> best,
> Colin
>
>
> >
> > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao <j...@confluent.io> wrote:
> >
> >> Hi, David,
> >>
> >> Thanks for the KIP. A few comments below.
> >>
> >> 10. It would be useful to describe how the controller node determines
> the
> >> RPC version used to communicate to other controller nodes. There seems
> to
> >> be a bootstrap problem. A controller node can't read the log and
> >> therefore the feature level until a quorum leader is elected. But leader
> >> election requires an RPC.
> >>
> >> 11. For downgrades, it would be useful to describe how to determine the
> >> downgrade process (generating new snapshot, propagating the snapshot,
> etc)
> >> has completed. We could block the UpdateFeature request until the
> process
> >> is completed. However, since the process could take time, the request
> could
> >> time out. Another way is through DescribeFeature and the server only
> >> reports downgraded versions after the process is completed.
> >>
> >> 12. Since we are changing UpdateFeaturesRequest, do we need to change
> the
> >> AdminClient api for updateFeatures too?
> >>
> >> 13. For the paragraph starting with "In the absence of an operator
> >> defined value for metadata.version", in KIP-584, we described how to
> >> finalize features with New cluster bootstrap. In that case, it's
> >> inconvenient for the users to have to run an admin tool to finalize the
> >> version for each feature. Instead, the system detects that the /features
> >> path is missing in ZK and thus automatically finalizes every feature
> with
> >> the latest supported version. Could we do something similar in the KRaft
> >> mode?
> >>
> >> 14. After the quorum leader generates a new snapshot, how do we force
> >> other nodes to pick up the new snapshot?
> >>
> >> 15. I agree with Jose that it will be useful to describe when
> generating a
> >> new snapshot is needed. To me, it seems the new snapshot is only needed
> >> when incompatible changes are made.
> >>
> >> 7. Jose, what control records were you referring to?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >> On Tue, Oct 5, 2021 at 8:53 AM David Arthur <davidart...@apache.org>
> >> wrote:
> >>
> >>> Jose, thanks for the thorough review and comments!
> >>>
> >>> I am out of the office until next week, so I probably won't be able to
> >>> update the KIP until then. Here are some replies to your questions:
> >>>
> >>> 1. Generate snapshot on upgrade
> >>> > > Metadata snapshot is generated and sent to the other nodes
> >>> > Why does the Active Controller need to generate a new snapshot and
> >>> > force a snapshot fetch from the replicas (inactive controller and
> >>> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
> >>> > enough to communicate the upgrade to the replicas?
> >>>
> >>>
> >>> You're right, we don't necessarily need to _transmit_ a snapshot, since
> >>> each node can generate its own equivalent snapshot.
> >>>
> >>> 2. Generate snapshot on downgrade
> >>> > > Metadata snapshot is generated and sent to the other inactive
> >>> > controllers and to brokers (this snapshot may be lossy!)
> >>> > Why do we need to send this downgraded snapshot to the brokers? The
> >>> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
> >>> > Can we have the replicas each independently generate a downgraded
> >>> > snapshot at the offset for the downgraded FeatureLevelRecord? I
> assume
> >>> > that the active controller will guarantee that all records after the
> >>> > FeatureLevelRecord use the downgraded version. If so, it would be good
> >>> > to mention that explicitly.
> >>>
> >>>
> >>> Similar to above, yes a broker that detects a downgrade via
> >>> FeatureLevelRecord could generate its own downgrade snapshot and reload
> >>> its
> >>> state from that. This does get a little fuzzy when we consider cases
> where
> >>> brokers are on different software versions and could be generating a
> >>> downgrade snapshot for version X, but using different versions of the
> >>> code.
> >>> It might be safer to let the controller generate the snapshot so each
> >>> broker (regardless of software version) gets the same records. However,
> >>> for
> >>> upgrades (or downgrades) we expect the whole cluster to be running the
> >>> same
> >>> software version before triggering the metadata.version change, so
> perhaps
> >>> this isn't a likely scenario. Thoughts?
> >>>
> >>>
> >>> 3. Max metadata version
> >>> > >For the first release that supports metadata.version, we can simply
> >>> > initialize metadata.version with the current (and only) version. For
> >>> future
> >>> > releases, we will need a mechanism to bootstrap a particular version.
> >>> This
> >>> > could be done using the meta.properties file or some similar
> mechanism.
> >>> The
> >>> > reason we need to allow for a specific initial version is to support
> >>> the
> >>> > use case of starting a Kafka cluster at version X with an older
> >>> > metadata.version.
> >>>
> >>>
> >>> I assume that the Active Controller will learn the metadata version of
> >>> > the broker through the BrokerRegistrationRequest. How will the Active
> >>> > Controller learn about the max metadata version of the inactive
> >>> > controller nodes? We currently don't send a registration request from
> >>> > the inactive controller to the active controller.
> >>>
> >>>
> >>> This came up during the design, but I neglected to add it to the KIP.
> We
> >>> will need a mechanism for determining the supported features of each
> >>> controller similar to how brokers use BrokerRegistrationRequest.
> Perhaps
> >>> controllers could write a FeatureLevelRecord (or similar) to the
> metadata
> >>> log indicating their supported version. WDYT?
> >>>
> >>> Why do you need to bootstrap a particular version? Isn't the intent
> >>> > that the broker will learn the active metadata version by reading the
> >>> > metadata before unfencing?
> >>>
> >>>
> >>> This bootstrapping is needed when a KRaft cluster is first
> started. If
> >>> we don't have this mechanism, the cluster can't really do anything
> until
> >>> the operator finalizes the metadata.version with the tool. The
> >>> bootstrapping will be done by the controller and the brokers will see
> this
> >>> version as a record (like you say). I'll add some text to clarify this.
> >>>
> >>>
> >>> 4. Reject Registration - This is related to the bullet point above.
> >>> > What will be the behavior of the active controller if the broker
> sends
> >>> > a metadata version that is not compatible with the cluster wide
> >>> > metadata version?
> >>>
> >>>
> >>> If a broker starts up with a lower supported version range than the
> >>> current
> >>> cluster metadata.version, it should log an error and shut down. This is
> in
> >>> line with KIP-584.
> >>>
> >>> 5. Discover upgrade
> >>> > > This snapshot will be a convenient way to let broker and controller
> >>> > components rebuild their entire in-memory state following an upgrade.
> >>> > Can we rely on the presence of the FeatureLevelRecord for the
> metadata
> >>> > version for this functionality? If so, it avoids having to reload the
> >>> > snapshot.
> >>>
> >>>
> >>> For upgrades, yes probably since we won't need to "rewrite" any
> records in
> >>> this case. For downgrades, we will need to generate the snapshot and
> >>> reload
> >>> everything.
> >>>
> >>> 6. Metadata version specification
> >>> > >  V4(version=4, isBackwardsCompatible=false, description="New
> metadata
> >>> > record type Bar"),
> >>>
> >>>
> >>> Very cool. Do you have plans to generate Apache Kafka HTML
> >>> > documentation for this information? It would be helpful to display this
> >>> > information to the user via kafka-features.sh and the feature RPC.
> >>>
> >>>
> >>> Hm good idea :) I'll add a brief section on documentation. This would
> >>> certainly be very useful
> >>>
> >>> 7.Downgrade records
> >>> > I think we should explicitly mention that the downgrade process will
> >>> > downgrade both metadata records and controller records. In KIP-630 we
> >>> > introduced two control records for snapshots.
> >>>
> >>>
> >>> Yes, good call. Let me re-read that KIP and include some details.
> >>>
> >>> Thanks again for the comments!
> >>> -David
> >>>
> >>>
> >>> On Wed, Sep 29, 2021 at 5:09 PM José Armando García Sancio
> >>> <jsan...@confluent.io.invalid> wrote:
> >>>
> >>> > One more comment.
> >>> >
> >>> > 7.Downgrade records
> >>> > I think we should explicitly mention that the downgrade process will
> >>> > downgrade both metadata records and controller records. In KIP-630 we
> >>> > introduced two control records for snapshots.
> >>> >
> >>>
> >>
>
