There are two approaches to how we update unstable metadata versions.

   1. The update only moves the MVs of unstable features to values greater
   than the MV of the new stable feature. The idea is that a specific
   unstable MV supports some set of features, and over time that set can
   only shrink, never grow. The issue is that moving a feature to make way
   for a stable feature with a higher MV will leave holes.
   2. We are free to reorder the MV for any unstable feature. This removes
   the hole issue, but it makes the unstable MVs more muddled: an MV no
   longer has a simple binary state where either its feature is available or
   there is a hole.
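
To make the difference concrete, here is a rough Java sketch of the two
approaches. It is only an illustration, not the MetadataVersion.java code:
the class, the method names, and the "<hole>" marker are all made up for
this example. MVs are modeled as list indexes, and everything above
latestProduction is treated as unstable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: index i holds the feature gated by MV i.
final class UnstableMvReorderSketch {
    static final String HOLE = "<hole>";

    // Approach 1: the promoted feature keeps its MV. Unstable features below
    // it are moved to newly allocated MVs at the end, leaving holes behind.
    // The new LATEST_PRODUCTION would be the promoted feature's index.
    static List<String> promoteLeavingHoles(List<String> mvs, int latestProduction, String feature) {
        List<String> out = new ArrayList<>(mvs);
        int target = out.indexOf(feature);
        if (target <= latestProduction) {
            throw new IllegalArgumentException(feature + " is not an unstable feature");
        }
        for (int mv = latestProduction + 1; mv < target; mv++) {
            String displaced = out.get(mv);
            if (!HOLE.equals(displaced)) {
                out.set(mv, HOLE);   // the old MV is never reused
                out.add(displaced);  // the feature gets a brand new MV
            }
        }
        return out;
    }

    // Approach 2: unstable MVs are freely reordered. The promoted feature
    // takes the first unstable slot and the others shift up, so no holes.
    // The new LATEST_PRODUCTION would be latestProduction + 1.
    static List<String> promoteByReordering(List<String> mvs, int latestProduction, String feature) {
        List<String> out = new ArrayList<>(mvs);
        int target = out.indexOf(feature);
        if (target <= latestProduction) {
            throw new IllegalArgumentException(feature + " is not an unstable feature");
        }
        out.remove(target);
        out.add(latestProduction + 1, feature);
        return out;
    }

    public static void main(String[] args) {
        // MV 0 (feature A) is stable; B and C are unstable; C becomes ready first.
        List<String> mvs = List.of("A", "B", "C");
        System.out.println(promoteLeavingHoles(mvs, 0, "C")); // [A, <hole>, C, B]
        System.out.println(promoteByReordering(mvs, 0, "C")); // [A, C, B]
    }
}
```

With approach 1, a cluster sitting on the old MV of B sees that functionality
disappear but never sees a different feature show up there. With approach 2,
the same MV number now means C instead of B.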


There are also two ends of the spectrum as to when we update the stable MV.

   1. We update at release points. This reduces churn in the unstable MVs
   and ties stable MVs more closely to the features accepted for a release,
   but it means a feature gets less testing on trunk as a stable MV.
   2. We update when the developers of a feature think it is done. This
   makes features available for more testing in trunk, but it forces the
   next release to include them as stable.


I'd like more feedback from others on these two dimensions.
--Proven



On Wed, Jan 10, 2024 at 12:16 PM Justine Olshan
<jols...@confluent.io.invalid> wrote:

> Hmm it seems like Colin and Proven are disagreeing about whether we can swap
> unstable metadata versions.
>
> >  When we reorder, we are always allocating a new MV and we are never
> reusing an existing MV even if it was also unstable.
>
> > Given that this is true, there's no reason to have special rules about
> what we can and can't do with unstable MVs. We can do anything
>
> I don't have a strong preference either way, but I think we should agree on
> one approach.
> The benefit of reordering and reusing is that we can release features that
> are ready earlier and we have more flexibility. With the approach where we
> always create a new MV, I am concerned with having many "empty" MVs. This
> would encourage waiting until the release before we decide an incomplete
> feature is not ready and moving its MV into the future. (The
> abandoning comment I made earlier -- that is consistent with Proven's
> approach)
>
> I think the only potential issue with reordering is that it could be a bit
> confusing and *potentially* prone to errors. Note I say potentially because
> I think it depends on folks' understanding of this new unstable metadata
> version concept. I echo Federico's comments about making sure the risks are
> highlighted.
>
> Thanks,
>
> Justine
>
> On Wed, Jan 10, 2024 at 1:16 AM Federico Valeri <fedeval...@gmail.com>
> wrote:
>
> > Hi folks,
> >
> > > If you use an unstable MV, you probably won't be able to upgrade your
> > software. Because whenever something changes, you'll probably get
> > serialization exceptions being thrown inside the controller. Fatal ones.
> >
> > Thanks for this clarification. I think this concrete risk should be
> > highlighted in the KIP and in the "unstable.metadata.versions.enable"
> > documentation.
> >
> > In the test plan, should we also have one system test checking that
> > "features with a stable MV will never have that MV changed"?
> >
> > On Wed, Jan 10, 2024 at 8:16 AM Colin McCabe <cmcc...@apache.org> wrote:
> > >
> > > On Tue, Jan 9, 2024, at 18:56, Proven Provenzano wrote:
> > > > Hi folks,
> > > >
> > > > Thank you for the questions.
> > > >
> > > > Let me clarify about reorder first. The reorder of unstable metadata
> > > > versions should be infrequent.
> > >
> > > Why does it need to be infrequent? We should be able to reorder
> unstable
> > metadata versions as often as we like. There are no guarantees about
> > unstable MVs.
> > >
> > > > The time you reorder is when a feature that
> > > > requires a higher metadata version to enable becomes "production
> > ready" and
> > > > the features with unstable metadata versions less than the new stable
> > one
> > > > are moved to metadata versions greater than the new stable feature.
> > When we
> > > > reorder, we are always allocating a new MV and we are never reusing
> an
> > > > existing MV even if it was also unstable. This way a developer
> > upgrading
> > > > their environment with a specific unstable MV might see existing
> > > > functionality stop working but they won't see new MV dependent
> > > > functionality magically appear. The feature set for a given unstable
> MV
> > > > version can only decrease with reordering.
> > >
> > > If you use an unstable MV, you probably won't be able to upgrade your
> > software. Because whenever something changes, you'll probably get
> > serialization exceptions being thrown inside the controller. Fatal ones.
> > >
> > > Given that this is true, there's no reason to have special rules about
> > what we can and can't do with unstable MVs. We can do anything.
> > >
> > > >
> > > > How do we define "production ready" and when should we bump
> > > > LATEST_PRODUCTION? I would like to define it to be the point where
> the
> > > > feature is code complete with tests and the KIP for it is approved.
> > However
> > > > even with this definition if the feature later develops a major issue
> > it
> > > > could still block future features until the issue is fixed which is
> > what we
> > > > are trying to avoid here. We could be much more formal about this and
> > let
> > > > the release manager for a release define what is stable for a given
> > release
> > > > and then do the bump just after the branch is created on the branch.
> > When
> > > > an RC candidate is accepted, the bump would be backported. I would
> > like to
> > > > hear other ideas here.
> > > >
> > >
> > > Yeah, it's an interesting question. Overall, I think developers should
> > define when a feature is production ready.
> > >
> > > The question to ask is, "are you ready to take this feature to
> > production in your workplace?" I think most developers do have a sense of
> > this. Obviously bugs and mistakes can happen, but I think this standard
> > would avoid most of the issues that we're trying to avoid by having
> > unstable MVs in the first place.
> > >
> > > ELR is a good example. Nobody would have said that it was production
> > ready in 3.7 ... hence it belonged (and still belongs) in an unstable MV,
> > until that changes (hopefully soon :) )
> > >
> > > best,
> > > Colin
> > >
> > > > --Proven
> > > >
> > > > On Tue, Jan 9, 2024 at 3:26 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > > >
> > > >> Hi Justine,
> > > >>
> > > >> Yes, this is an important point to clarify. Proven can comment more,
> > but
> > > >> my understanding is that we can do anything to unstable metadata
> > versions.
> > > >> Reorder them, delete them, change them in any other way. There are
> no
> > > >> stability guarantees. If the current text is unclear let's add more
> > > >> examples of what we can do (which is anything) :)
> > > >>
> > > >> best,
> > > >> Colin
> > > >>
> > > >>
> > > >> On Mon, Jan 8, 2024, at 14:18, Justine Olshan wrote:
> > > >> > Hey Colin,
> > > >> >
> > > >> > I had some offline discussions with Proven previously and it seems
> > like
> > > >> he
> > > >> > said something different so I'm glad I brought it up here.
> > > >> >
> > > >> > Let's clarify if we are ok with reordering unstable metadata
> > versions :)
> > > >> >
> > > >> > Justine
> > > >> >
> > > >> > On Mon, Jan 8, 2024 at 1:56 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > > >> >
> > > >> >> On Mon, Jan 8, 2024, at 13:19, Justine Olshan wrote:
> > > >> >> > Hey all,
> > > >> >> >
> > > >> >> > I was wondering how often we plan to update LATEST_PRODUCTION
> > metadata
> > > >> >> > version. Is this something we should do as soon as the feature
> is
> > > >> >> complete
> > > >> >> > or something we do when we are releasing Kafka? When is the
> time
> > we
> > > >> >> abandon
> > > >> >> > a MV so that other features can be unblocked?
> > > >> >>
> > > >> >> Hi Justine,
> > > >> >>
> > > >> >> Thanks for reviewing.
> > > >> >>
> > > >> >> The idea is that you should bump LATEST_PRODUCTION when you want
> to
> > > >> take a
> > > >> >> feature to production. That could mean deploying it internally
> > > >> somewhere to
> > > >> >> production, or doing an Apache release that lets everyone deploy
> > the
> > > >> thing
> > > >> >> to production.
> > > >> >>
> > > >> >> Not in production? No need to care about this. Make any changes
> you
> > > >> like.
> > > >> >>
> > > >> >> As a corollary, we should keep the LATEST_PRODUCTION version as
> > low as
> > > >> it
> > > >> >> can be. If you haven't tested the feature, don't freeze it in
> > stone yet.
> > > >> >>
> > > >> >> >
> > > >> >> > I am just considering a feature that may end up missing a
> > release. It
> > > >> >> seems
> > > >> >> > like maybe that MV would block future metadata versions until
> we
> > > >> decide
> > > >> >> the
> > > >> >> > feature won't make the cut. From that point, all "ready"
> features
> > > >> should
> > > >> >> be
> > > >> >> > able to be released.
> > > >> >>
> > > >> >> The intention is the opposite. A feature in an unstable metadata
> > version
> > > >> >> doesn't block anything. You can always move a feature from one
> > unstable
> > > >> >> metadata version to another if the feature starts taking too long
> > to
> > > >> finish.
> > > >> >>
> > > >> >> > I'm also wondering if the KIP should include some information
> > about
> > > >> how a
> > > >> >> > metadata version should be abandoned. Maybe there is a specific message
> > to
> > > >> write
> > > >> >> in
> > > >> >> > the file? So folks who were maybe waiting on that version know
> > they
> > > >> can
> > > >> >> > release their feature?
> > > >> >> >
> > > >> >> > I am also assuming that we don't shift all the waiting metadata
> > > >> versions
> > > >> >> > when we abandon a version, but it would be good to clarify and
> > > >> include in
> > > >> >> > the KIP.
> > > >> >>
> > > >> >> I'm not sure what you mean by abandoning a version. We never
> > abandon a
> > > >> >> version once it's stable.
> > > >> >>
> > > >> >> Unstable versions can change. I wouldn't describe this as
> > "abandonment",
> > > >> >> just the MV changing prior to release.
> > > >> >>
> > > >> >> In a similar way, the contents of the 3.7 branch will change up
> > until
> > > >> >> 3.7.0 is released. Once it gets released, it's never unreleased.
> > We just
> > > >> >> move on to 3.7.1. Same thing here.
> > > >> >>
> > > >> >> best,
> > > >> >> Colin
> > > >> >>
> > > >> >> >
> > > >> >> > Thanks,
> > > >> >> >
> > > >> >> > Justine
> > > >> >> >
> > > >> >> > On Mon, Jan 8, 2024 at 12:44 PM Colin McCabe <
> cmcc...@apache.org
> > >
> > > >> wrote:
> > > >> >> >
> > > >> >> >> Hi Proven,
> > > >> >> >>
> > > >> >> >> Thanks for the KIP. I think there is a need for this
> > capability, for
> > > >> >> those
> > > >> >> >> of us who deploy from trunk (or branches derived from trunk).
> > > >> >> >>
> > > >> >> >> With regard to "unstable.metadata.versions.enable": is this
> > going to
> > > >> be
> > > >> >> a
> > > >> >> >> documented configuration, or an internal one? I am guessing we
> > want
> > > >> it
> > > >> >> to
> > > >> >> >> be documented, so that users can use it. If we do, we should
> > probably
> > > >> >> also
> > > >> >> >> very prominently warn that THIS WILL BREAK UPGRADES FOR YOUR
> > CLUSTER.
> > > >> >> That
> > > >> >> >> includes logging an ERROR message on startup, etc.
> > > >> >> >>
> > > >> >> >> It would be good to document if a release can go out that
> > contains
> > > >> >> "future
> > > >> >> >> MVs" that are unstable. Like can we make a 3.8 release that
> > contains
> > > >> >> >> IBP_4_0_IV0 in MetadataVersion.java, as an unstable future MV?
> > > >> >> Personally I
> > > >> >> >> think the answer should be "yes," but with the usual caveats.
> > When
> > > >> the
> > > >> >> >> actual 4.0 comes out, the unstable 4.0 MV that shipped in 3.8
> > > >> probably
> > > >> >> >> won't work, and you won't be able to upgrade. (It was
> unstable,
> > we
> > > >> told
> > > >> >> you
> > > >> >> >> not to use it.)
> > > >> >> >>
> > > >> >> >> best,
> > > >> >> >> Colin
> > > >> >> >>
> > > >> >> >>
> > > >> >> >> On Fri, Jan 5, 2024, at 07:32, Proven Provenzano wrote:
> > > >> >> >> > Hey folks,
> > > >> >> >> >
> > > >> >> >> > I am starting a discussion thread for managing unstable
> > metadata
> > > >> >> >> > versions
> > > >> >> >> > in Apache Kafka.
> > > >> >> >> >
> > > >> >> >>
> > > >> >>
> > > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1014%3A+Managing+Unstable+Metadata+Versions+in+Apache+Kafka
> > > >> >> >> >
> > > >> >> >> > This KIP is actually already implemented in 3.7 with PR
> > > >> >> >> > https://github.com/apache/kafka/pull/14860.
> > > >> >> >> > I have created this KIP to explain the motivation and how
> > managing
> > > >> >> >> Metadata
> > > >> >> >> > Versions is expected to work.
> > > >> >> >> > Comments are greatly appreciated as this process can always
> be
> > > >> >> improved.
> > > >> >> >> >
> > > >> >> >> > --
> > > >> >> >> > --Proven
> > > >> >> >>
> > > >> >>
> > > >>
> >
>
