Hi Colin,

Thanks for the suggestions. It is a good idea to refer to the
'__data_version__' as just 'epoch', to avoid any confusion. However, note
that this is not the same as the broker epoch. The main distinction is that
this epoch is bumped by the controller whenever a modification to the
finalized feature versions is persisted into ZK.
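
To illustrate the distinction, here is a minimal sketch of the epoch
semantics described above. It is not the KIP's implementation: the in-memory
FakeZk store, the Controller class, its method names, the 'features' key and
the feature name are all hypothetical. Only the '/features' path and the rule
"bump the epoch each time a change to the finalized features is persisted"
come from this thread.

```python
import json


class FakeZk:
    """Hypothetical in-memory stand-in for ZooKeeper (illustration only)."""

    def __init__(self):
        self.nodes = {}

    def write(self, path, payload):
        self.nodes[path] = payload

    def read(self, path):
        return self.nodes[path]


class Controller:
    """Hypothetical controller that owns the finalized-features epoch."""

    def __init__(self, zk):
        self.zk = zk
        self.epoch = 0
        self.finalized = {}

    def finalize(self, feature, version_level):
        # Build the updated feature map and the next epoch, then persist both
        # in one write, so every persisted change carries a larger epoch.
        updated = dict(self.finalized)
        updated[feature] = version_level
        next_epoch = self.epoch + 1
        payload = json.dumps({"epoch": next_epoch, "features": updated})
        self.zk.write("/features", payload)
        # Only adopt the new state after the write has gone through.
        self.finalized, self.epoch = updated, next_epoch


def more_recent(a, b):
    """Pick the fresher of two eventually consistent reads: larger epoch wins."""
    return a if a["epoch"] >= b["epoch"] else b
```

A reader that sees two copies of the metadata can pass both to more_recent()
and keep the result. This epoch changes only when the feature data changes,
whereas a broker epoch changes whenever a broker registers.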

I have updated the KIP to use the new schema for the ‘/features’ ZK node:

   - We use 2 separate fields, ‘epoch’ and ‘version’, the latter describing
   changes to the overall schema of the data that is written to ZooKeeper in
   the '/features' node.
   - We don’t have separate header and data sections; I have combined
   these so that we have just 1 dictionary containing both.
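
As a rough sketch of that layout (not copied from the KIP), the node contents
could be read as shown below. Only 'epoch' and 'version' are named in this
email; the 'features' key, the feature name, the numeric values, and the
parse_features_node helper are placeholders made up for illustration.

```python
import json

# Assumption: the schema versions this (hypothetical) reader understands.
SUPPORTED_SCHEMA_VERSIONS = {1}


def parse_features_node(raw):
    """Parse the single dictionary stored in the '/features' node."""
    node = json.loads(raw)
    # 'version' describes the schema of the node, so check it first and
    # refuse to interpret a layout this reader does not know.
    if node["version"] not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError("unsupported schema version: %r" % node["version"])
    # 'epoch' says how recent the data is; it sits in the same dictionary
    # as the feature data rather than in a separate header.
    return node["epoch"], node.get("features", {})


# Placeholder contents: one flat dictionary, no header/data split.
example = json.dumps({"version": 1, "epoch": 5, "features": {"example_feature": 3}})
epoch, features = parse_features_node(example)
```

Checking 'version' before touching the rest of the dictionary is what lets a
future schema change be rolled out safely across brokers.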


Here is a link to the updated section:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Persistenceoffinalizedfeatureversions
.

Please feel free to let me know if you have any questions or concerns.


Cheers,
Kowshik


On Mon, Mar 30, 2020 at 4:53 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Thu, Mar 26, 2020, at 19:24, Kowshik Prakasam wrote:
> > Hi Colin,
> >
> > Thanks for the feedback! I've changed the KIP to address your
> > suggestions.
> > Please find below my explanation. Here is a link to KIP 584:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > .
> >
> > 1. '__data_version__' is the version of the finalized feature metadata
> > (i.e. actual ZK node contents), while the '__schema_version__' is the
> > version of the schema of the data persisted in ZK. These serve different
> > purposes. '__data_version__' is useful mainly to clients during reads,
> > to differentiate between the 2 versions of eventually consistent
> > 'finalized features' metadata (i.e. larger metadata version is more
> > recent). '__schema_version__' provides an additional degree of
> > flexibility, where if we decide to change the schema for '/features' node
> > in ZK (in the future), then we can manage broker roll outs suitably (i.e.
> > serialization/deserialization of the ZK data can be handled safely).
>
> Hi Kowshik,
>
> If you're talking about a number that lets you know if data is more or
> less recent, we would typically call that an epoch, and not a version.  For
> the ZK data structures, the word "version" is typically reserved for
> describing changes to the overall schema of the data that is written to
> ZooKeeper.  We don't even really change the "version" of those schemas that
> much, since most changes are backwards-compatible.  But we do include that
> version field just in case.
>
> I don't think we really need an epoch here, though, since we can just look
> at the broker epoch.  Whenever the broker registers, its epoch will be
> greater than the previous broker epoch.  And the newly registered data will
> take priority.  This will be a lot simpler than adding a separate epoch
> system, I think.
>
> >
> > 2. Regarding admin client needing min and max information - you are
> > right! I've changed the KIP such that the Admin API also allows the user
> > to read 'supported features' from a specific broker. Please look at the
> > section "Admin API changes".
>
> Thanks.
>
> >
> > 3. Regarding the use of `long` vs `Long` - it was not deliberate. I've
> > improved the KIP to just use `long` at all places.
>
> Sounds good.
>
> >
> > 4. Regarding kafka.admin.FeatureCommand tool - you are right! I've
> > updated the KIP sketching the functionality provided by this tool, with
> > some examples. Please look at the section "Tooling support examples".
> >
> > Thank you!
>
>
> Thanks, Kowshik.
>
> cheers,
> Colin
>
> >
> >
> > Cheers,
> > Kowshik
> >
> > On Wed, Mar 25, 2020 at 11:31 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Thanks, Kowshik, this looks good.
> > >
> > > In the "Schema" section, do we really need both __schema_version__ and
> > > __data_version__?  Can we just have a single version field here?
> > >
> > > Shouldn't the Admin(Client) function have some way to get the min and
> > > max information that we're exposing as well?  I guess we could have
> > > min, max, and current.  Unrelated: is the use of Long rather than long
> > > deliberate here?
> > >
> > > It would be good to describe how the command line tool
> > > kafka.admin.FeatureCommand will work.  For example the flags that it
> > > will take and the output that it will generate to STDOUT.
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > On Tue, Mar 24, 2020, at 17:08, Kowshik Prakasam wrote:
> > > > Hi all,
> > > >
> > > > I've opened KIP-584, which is intended to provide a versioning scheme
> > > > for features. I'd like to use this thread to discuss the same. I'd
> > > > appreciate any feedback on this. Here is a link to KIP-584:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > > > .
> > > >
> > > > Thank you!
> > > >
> > > >
> > > > Cheers,
> > > > Kowshik
> > > >
> > >
> >
>
