Hi Justine,

I know you started the vote thread, but on re-reading the KIP I noticed
that although the topic id is included in the MetadataResponse it's not
surfaced in the output from `kafka-topics.sh --describe`. Maybe that was
intentional because ids are intentionally not really something the user
should care deeply about, but it would also make life harder for anyone
debugging Kafka and this would likely get worse the more topic ids got
rolled out across the protocols, clients etc. It seems likely that
`kafka-topics.sh` will eventually need the ability to show the id of a
topic and perhaps find a topic name given an id. Is there any reason not to
implement that in this KIP?

Many thanks,

Tom

On Mon, Sep 21, 2020 at 9:54 PM Justine Olshan <jols...@confluent.io> wrote:

> Hi all,
>
> After thinking about it, I've decided to remove the topic name from the
> Fetch Request and Response after all. Since there are so many of these
> requests per second, it is worth removing the extra information. I've
> updated the KIP to reflect this change.
>
> Please let me know if there is anything else we should discuss before
> voting.
>
> Thank you,
> Justine
>
> On Fri, Sep 18, 2020 at 9:46 AM Justine Olshan <jols...@confluent.io>
> wrote:
>
> > Hi Jun,
> >
> > I see what you are saying. For now we can remove the extra information.
> > I'll leave the option to add more fields to the file in the future. The
> KIP
> > has been updated to reflect this change.
> >
> > Thanks,
> > Justine
> >
> > On Fri, Sep 18, 2020 at 8:46 AM Jun Rao <j...@confluent.io> wrote:
> >
> >> Hi, Justine,
> >>
> >> Thanks for the reply.
> >>
> >> 13. If the log directory is the source of truth, it means that the
> >> redundant info in the metadata file will be ignored. Then the question
> is
> >> why do we need to put the redundant info in the metadata file now?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Sep 17, 2020 at 5:07 PM Justine Olshan <jols...@confluent.io>
> >> wrote:
> >>
> >> > Hi Jun,
> >> > Thanks for the quick response!
> >> >
> >> > 12. I've decided to bump up the versions on the requests and updated
> the
> >> > KIP. I think it's good we thoroughly discussed the options here, so we
> >> know
> >> > we made a good choice. :)
> >> >
> >> > 13. This is an interesting situation. I think if this does occur we
> >> should
> >> > give a warning. I agree that it's hard to know the source of truth for
> >> sure
> >> > since the directory or the file could be manually modified. I guess
> the
> >> > directory could be used as the source of truth. To be honest, I'm not
> >> > really sure what happens in kafka when the log directory is renamed
> >> > manually in such a way. I'm also wondering if the situation is
> >> recoverable
> >> > in this scenario.
> >> >
> >> > Thanks,
> >> > Justine
> >> >
> >> > On Thu, Sep 17, 2020 at 4:28 PM Jun Rao <j...@confluent.io> wrote:
> >> >
> >> > > Hi, Justine,
> >> > >
> >> > > Thanks for the reply.
> >> > >
> >> > > 12. I don't have a strong preference either. However, if we need IBP
> >> > > anyway, maybe it's easier to just bump up the version for all inter
> >> > broker
> >> > > requests and add the topic id field as a regular field. A regular
> >> field
> >> > is
> >> > > a bit more concise in wire transfer than a flexible field.
> >> > >
> >> > > 13. The confusion that I was referring to is between the topic name
> >> and
> >> > > partition number between the log dir and the metadata file. For
> >> example,
> >> > if
> >> > > the log dir is topicA-1 and the metadata file in it has topicB and
> >> > > partition 0 (say due to a bug or manual modification), which one do
> we
> >> > use
> >> > > as the source of truth?
> >> > >
> >> > > Jun
> >> > >
> >> > > On Thu, Sep 17, 2020 at 3:43 PM Justine Olshan <
> jols...@confluent.io>
> >> > > wrote:
> >> > >
> >> > > > Hi Jun,
> >> > > > Thanks for the comments.
> >> > > >
> >> > > > 12. I bumped the LeaderAndIsrRequest because I removed the topic
> >> name
> >> > > field
> >> > > > in the response. It may be possible to avoid bumping the version
> >> > without
> >> > > > that change, but I may be missing something.
> >> > > > I believe StopReplica is actually on version 3 now, but because
> >> > version 2
> >> > > > is flexible, I kept that listed as version 2 on the KIP page.
> >> However,
> >> > > you
> >> > > > may be right in that we may need to bump the version on
> StopReplica
> >> to
> >> > > deal
> >> > > > with deletion differently as mentioned above. I don't know if I
> >> have a
> >> > > big
> >> > > > preference over used tagged fields or not.
> >> > > >
> >> > > > 13. I was thinking that in the case where the file and the request
> >> > topic
> >> > > > ids don't match, it means that the broker's topic/the one in the
> >> file
> >> > has
> >> > > > been deleted. In that case, we would need to delete the old topic
> >> and
> >> > > start
> >> > > > receiving the new version. If the topic name were to change, but
> the
> >> > ids
> >> > > > still matched, the file would also need to update. Am I missing a
> >> case
> >> > > > where the file would be correct and not the request?
> >> > > >
> >> > > > Thanks,
> >> > > > Justine
> >> > > >
> >> > > > On Thu, Sep 17, 2020 at 3:18 PM Jun Rao <j...@confluent.io> wrote:
> >> > > >
> >> > > > > Hi, Justine,
> >> > > > >
> >> > > > > Thanks for the reply. A couple of more comments below.
> >> > > > >
> >> > > > > 12. ListOffset and OffsetForLeader currently don't support
> >> flexible
> >> > > > fields.
> >> > > > > So, we have to bump up the version number and use IBP at least
> for
> >> > > these
> >> > > > > two requests. Note that it seems 2.7.0 will require IBP anyway
> >> > because
> >> > > of
> >> > > > > changes in KAFKA-10435. Also, it seems that the version for
> >> > > > > LeaderAndIsrRequest and StopReplica are bumped even though we
> only
> >> > > added
> >> > > > a
> >> > > > > tagged field. But since IBP is needed anyway, we may want to
> >> revisit
> >> > > the
> >> > > > > overall tagged field choice.
> >> > > > >
> >> > > > > 13. The only downside is the potential confusion on which one is
> >> the
> >> > > > source
> >> > > > > of truth if they don't match. Another option is to include those
> >> > fields
> >> > > > in
> >> > > > > the metadata file when we actually change the directory
> structure.
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > Jun
> >> > > > >
> >> > > > > On Thu, Sep 17, 2020 at 2:01 PM Justine Olshan <
> >> jols...@confluent.io
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hello all,
> >> > > > > >
> >> > > > > > I've thought some more about removing the topic name field
> from
> >> > some
> >> > > of
> >> > > > > the
> >> > > > > > requests. On closer inspection of the requests/responses, it
> >> seems
> >> > > that
> >> > > > > the
> >> > > > > > internal changes would be much larger than I expected. Some
> >> > protocols
> >> > > > > > involve clients, so they would require changes too. I'm
> thinking
> >> > that
> >> > > > for
> >> > > > > > now, removing the topic name from these requests and responses
> >> are
> >> > > out
> >> > > > of
> >> > > > > > scope.
> >> > > > > >
> >> > > > > > I have decided to just keep the change LeaderAndIsrResponse to
> >> > remove
> >> > > > the
> >> > > > > > topic name, and have updated the KIP to reflect this change. I
> >> have
> >> > > > also
> >> > > > > > mentioned the other requests and responses in future work.
> >> > > > > >
> >> > > > > > I'm hoping to start the voting process soon, so let me know if
> >> > there
> >> > > is
> >> > > > > > anything else we should discuss.
> >> > > > > >
> >> > > > > > Thank you,
> >> > > > > > Justine
> >> > > > > >
> >> > > > > > On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan <
> >> > jols...@confluent.io
> >> > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hello again,
> >> > > > > > > To follow up on some of the other comments:
> >> > > > > > >
> >> > > > > > > 10/11) We can remove the topic name from these
> >> > requests/responses,
> >> > > > and
> >> > > > > > > that means we will just have to make a few internal changes
> to
> >> > make
> >> > > > > > > partitions accessible by topic id and partition. I can
> update
> >> the
> >> > > KIP
> >> > > > > to
> >> > > > > > > remove them unless anyone thinks they should stay.
> >> > > > > > >
> >> > > > > > > 12) Addressed in the previous email. I've updated the KIP to
> >> > > include
> >> > > > > > > tagged fields for the requests and responses. (More on that
> >> > below)
> >> > > > > > >
> >> > > > > > > 13) I think part of the idea for including this information
> >> is to
> >> > > > > prepare
> >> > > > > > > for future changes. Perhaps the directory structure might
> >> change
> >> > > from
> >> > > > > > > topicName_partitionNumber to something like
> >> > > topicID_partitionNumber.
> >> > > > > Then
> >> > > > > > > it would be useful to have the topic name in the file since
> it
> >> > > would
> >> > > > > not
> >> > > > > > be
> >> > > > > > > in the directory structure. Supporting topic renames might
> be
> >> > > easier
> >> > > > if
> >> > > > > > the
> >> > > > > > > other fields are included. Would there be any downsides to
> >> > > including
> >> > > > > this
> >> > > > > > > information?
> >> > > > > > >
> >> > > > > > > 14)  Yes, we would need to copy the partition metadata file
> in
> >> > this
> >> > > > > > > process. I've updated the KIP to include this.
> >> > > > > > >
> >> > > > > > > 15) I believe Lucas meant v1 and v2 here. He was referring
> to
> >> how
> >> > > the
> >> > > > > > > requests would fall under different IBP and meant that older
> >> > > brokers
> >> > > > > > would
> >> > > > > > > have to use the older version of the request and the
> existing
> >> > topic
> >> > > > > > > deletion process. At first, it seemed like tagged fields
> would
> >> > > > resolve
> >> > > > > > > the IBP issue. However, we may need IBP for this request
> after
> >> > all
> >> > > > > since
> >> > > > > > > the controller handles the topic deletion differently
> >> depending
> >> > on
> >> > > > the
> >> > > > > > IBP
> >> > > > > > > version. In an older version, we can't just send a
> StopReplica
> >> > > delete
> >> > > > > the
> >> > > > > > > topic immediately like we'd want to for this KIP.
> >> > > > > > >
> >> > > > > > > This makes me wonder if we want tagged fields on all the
> >> requests
> >> > > > after
> >> > > > > > > all. Let me know your thoughts!
> >> > > > > > >
> >> > > > > > > Justine
> >> > > > > > >
> >> > > > > > > On Tue, Sep 15, 2020 at 1:03 PM Justine Olshan <
> >> > > jols...@confluent.io
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > >> Hi all,
> >> > > > > > >> Jun brought up a good point in his last email about tagged
> >> > fields,
> >> > > > and
> >> > > > > > >> I've updated the KIP to reflect that the changes to
> requests
> >> and
> >> > > > > > responses
> >> > > > > > >> will be in the form of tagged fields to avoid changing IBP.
> >> > > > > > >>
> >> > > > > > >> Jun: I plan on sending a followup email to address some of
> >> the
> >> > > other
> >> > > > > > >> points.
> >> > > > > > >>
> >> > > > > > >> Thanks,
> >> > > > > > >> Justine
> >> > > > > > >>
> >> > > > > > >> On Mon, Sep 14, 2020 at 4:25 PM Jun Rao <j...@confluent.io>
> >> > wrote:
> >> > > > > > >>
> >> > > > > > >>> Hi, Justine,
> >> > > > > > >>>
> >> > > > > > >>> Thanks for the updated KIP. A few comments below.
> >> > > > > > >>>
> >> > > > > > >>> 10. LeaderAndIsr Response: Do we need the topic name?
> >> > > > > > >>>
> >> > > > > > >>> 11. For the changed request/response, other than
> >> LeaderAndIsr,
> >> > > > > > >>> UpdateMetadata, Metadata, do we need to include the topic
> >> name?
> >> > > > > > >>>
> >> > > > > > >>> 12. It seems that upgrades don't require IBP. Does that
> mean
> >> > the
> >> > > > new
> >> > > > > > >>> fields
> >> > > > > > >>> in all the request/response are added as tagged fields
> >> without
> >> > > > > bumping
> >> > > > > > up
> >> > > > > > >>> the request version? It would be useful to make that
> clear.
> >> > > > > > >>>
> >> > > > > > >>> 13. Partition Metadata file: Do we need to include the
> topic
> >> > name
> >> > > > and
> >> > > > > > the
> >> > > > > > >>> partition id since they are implied in the directory name?
> >> > > > > > >>>
> >> > > > > > >>> 14. In the JBOD mode, we support moving a partition's data
> >> from
> >> > > one
> >> > > > > > disk
> >> > > > > > >>> to
> >> > > > > > >>> another. Will the new partition metadata file be copied
> >> during
> >> > > that
> >> > > > > > >>> process?
> >> > > > > > >>>
> >> > > > > > >>> 15. The KIP says "Remove deleted topics from replicas by
> >> > sending
> >> > > > > > >>> StopReplicaRequest V2 for any topics which do not contain
> a
> >> > topic
> >> > > > ID,
> >> > > > > > and
> >> > > > > > >>> V3 for any topics which do contain a topic ID.". However,
> it
> >> > > seems
> >> > > > > the
> >> > > > > > >>> updated controller will create all missing topic IDs first
> >> > before
> >> > > > > doing
> >> > > > > > >>> other actions. So, is StopReplicaRequest V2 needed?
> >> > > > > > >>>
> >> > > > > > >>> Jun
> >> > > > > > >>>
> >> > > > > > >>> On Fri, Sep 11, 2020 at 10:31 AM John Roesler <
> >> > > vvcep...@apache.org
> >> > > > >
> >> > > > > > >>> wrote:
> >> > > > > > >>>
> >> > > > > > >>> > Thanks, Justine!
> >> > > > > > >>> >
> >> > > > > > >>> > Your response seems compelling to me.
> >> > > > > > >>> >
> >> > > > > > >>> > -John
> >> > > > > > >>> >
> >> > > > > > >>> > On Fri, 2020-09-11 at 10:17 -0700, Justine Olshan wrote:
> >> > > > > > >>> > > Hello all,
> >> > > > > > >>> > > Thanks for continuing the discussion! I have a few
> >> > responses
> >> > > to
> >> > > > > > your
> >> > > > > > >>> > points.
> >> > > > > > >>> > >
> >> > > > > > >>> > > Tom: You are correct in that this KIP has not
> mentioned
> >> the
> >> > > > > > >>> > > DeleteTopicsRequest. I think that this would be out of
> >> > scope
> >> > > > for
> >> > > > > > >>> now, but
> >> > > > > > >>> > > may be something worth adding in the future.
> >> > > > > > >>> > >
> >> > > > > > >>> > > John: We did consider sequence ids, but there are a
> few
> >> > > reasons
> >> > > > > to
> >> > > > > > >>> favor
> >> > > > > > >>> > > UUIDs. There are several cases where topics from
> >> different
> >> > > > > clusters
> >> > > > > > >>> may
> >> > > > > > >>> > > interact now and in the future. For example, Mirror
> >> Maker 2
> >> > > may
> >> > > > > > >>> benefit
> >> > > > > > >>> > > from being able to detect when a cluster being
> mirrored
> >> is
> >> > > > > deleted
> >> > > > > > >>> and
> >> > > > > > >>> > > recreated and globally unique identifiers would make
> >> > > resolving
> >> > > > > > issues
> >> > > > > > >>> > > easier than sequence IDs which may collide between
> >> > clusters.
> >> > > > > > KIP-405
> >> > > > > > >>> > > (tiered storage) will also benefit from globally
> unique
> >> IDs
> >> > > as
> >> > > > > > shared
> >> > > > > > >>> > > buckets may be used between clusters.
> >> > > > > > >>> > >
> >> > > > > > >>> > > Globally unique IDs would also make functionality like
> >> > moving
> >> > > > > > topics
> >> > > > > > >>> > > between disparate clusters easier in the future,
> >> simplify
> >> > any
> >> > > > > > future
> >> > > > > > >>> > > implementations of backups and restores, and more. In
> >> > > general,
> >> > > > > > >>> unique IDs
> >> > > > > > >>> > > would ensure that the source cluster topics do not
> >> conflict
> >> > > > with
> >> > > > > > the
> >> > > > > > >>> > > destination cluster topics.
> >> > > > > > >>> > >
> >> > > > > > >>> > > If we were to use sequence ids, we would need
> >> sufficiently
> >> > > > large
> >> > > > > > >>> cluster
> >> > > > > > >>> > > ids to be stored with the topic identifiers or we run
> >> the
> >> > > risk
> >> > > > of
> >> > > > > > >>> > > collisions. This will give up any advantage in
> >> compactness
> >> > > that
> >> > > > > > >>> sequence
> >> > > > > > >>> > > numbers may bring. Given these advantages I think it
> >> makes
> >> > > > sense
> >> > > > > to
> >> > > > > > >>> use
> >> > > > > > >>> > > UUIDs.
> >> > > > > > >>> > >
> >> > > > > > >>> > > Gokul: This is an interesting idea, but this is a
> >> breaking
> >> > > > > change.
> >> > > > > > >>> Out of
> >> > > > > > >>> > > scope for now, but maybe worth discussing in the
> future.
> >> > > > > > >>> > >
> >> > > > > > >>> > > Hope this explains some of the decisions,
> >> > > > > > >>> > >
> >> > > > > > >>> > > Justine
> >> > > > > > >>> > >
> >> > > > > > >>> > >
> >> > > > > > >>> > >
> >> > > > > > >>> > > On Fri, Sep 11, 2020 at 8:27 AM Gokul Ramanan
> >> Subramanian <
> >> > > > > > >>> > > gokul24...@gmail.com> wrote:
> >> > > > > > >>> > >
> >> > > > > > >>> > > > Hi.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Thanks for the KIP.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Have you thought about whether it makes sense to
> >> support
> >> > > > > > >>> authorizing a
> >> > > > > > >>> > > > principal for a topic ID rather than a topic name to
> >> > > achieve
> >> > > > > > >>> tighter
> >> > > > > > >>> > > > security?
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Or is the topic ID fundamentally an internal detail
> >> > similar
> >> > > > to
> >> > > > > > >>> epochs
> >> > > > > > >>> > used
> >> > > > > > >>> > > > in a bunch of other places in Kafka?
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > Thanks.
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > On Fri, Sep 11, 2020 at 4:06 PM John Roesler <
> >> > > > > > vvcep...@apache.org>
> >> > > > > > >>> > wrote:
> >> > > > > > >>> > > >
> >> > > > > > >>> > > > > Hello Justine,
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > Thanks for the KIP!
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > I happen to have been confronted recently with the
> >> need
> >> > > to
> >> > > > > keep
> >> > > > > > >>> > track of
> >> > > > > > >>> > > > a
> >> > > > > > >>> > > > > large number of topics as compactly as possible. I
> >> was
> >> > > > going
> >> > > > > to
> >> > > > > > >>> come
> >> > > > > > >>> > up
> >> > > > > > >>> > > > > with some way to dictionary encode the topic names
> >> as
> >> > > > > integers,
> >> > > > > > >>> but
> >> > > > > > >>> > this
> >> > > > > > >>> > > > > seems much better!
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > Apologies if this has been raised before, but I’m
> >> > > wondering
> >> > > > > > >>> about the
> >> > > > > > >>> > > > > choice of UUID vs sequence number for the ids.
> >> > Typically,
> >> > > > > I’ve
> >> > > > > > >>> seen
> >> > > > > > >>> > UUIDs
> >> > > > > > >>> > > > > in two situations:
> >> > > > > > >>> > > > > 1. When processes need to generate non-colliding
> >> > > > identifiers
> >> > > > > > >>> without
> >> > > > > > >>> > > > > coordination.
> >> > > > > > >>> > > > > 2. When the identifier needs to be “universally
> >> > unique”;
> >> > > > > I.e.,
> >> > > > > > >>> the
> >> > > > > > >>> > > > > identifier needs to distinguish the entity from
> all
> >> > other
> >> > > > > > >>> entities
> >> > > > > > >>> > that
> >> > > > > > >>> > > > > could ever exist. This is useful in cases where
> >> > entities
> >> > > > from
> >> > > > > > all
> >> > > > > > >>> > kinds
> >> > > > > > >>> > > > of
> >> > > > > > >>> > > > > systems get mixed together, such as when dumping
> >> logs
> >> > > from
> >> > > > > all
> >> > > > > > >>> > processes
> >> > > > > > >>> > > > in
> >> > > > > > >>> > > > > a company into a common system.
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > Maybe I’m being short-sighted, but it doesn’t seem
> >> like
> >> > > > > either
> >> > > > > > >>> really
> >> > > > > > >>> > > > > applies here. It seems like the brokers could and
> >> would
> >> > > > > achieve
> >> > > > > > >>> > consensus
> >> > > > > > >>> > > > > when creating a topic anyway, which is all that’s
> >> > > required
> >> > > > to
> >> > > > > > >>> > generate
> >> > > > > > >>> > > > > non-colliding sequence ids. For the second, as you
> >> > > mention,
> >> > > > > we
> >> > > > > > >>> could
> >> > > > > > >>> > > > assign
> >> > > > > > >>> > > > > a UUID to the cluster as a whole, which would
> render
> >> > any
> >> > > > > > resource
> >> > > > > > >>> > scoped
> >> > > > > > >>> > > > to
> >> > > > > > >>> > > > > the broker universally unique as well.
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > The reason I mention this is that, although a UUID
> >> is
> >> > way
> >> > > > > more
> >> > > > > > >>> > compact
> >> > > > > > >>> > > > > than topic names, it’s still 16 bytes. In
> contrast,
> >> a
> >> > > > 4-byte
> >> > > > > > >>> integer
> >> > > > > > >>> > > > > sequence id would give us 4 billion unique topics
> >> per
> >> > > > > cluster,
> >> > > > > > >>> which
> >> > > > > > >>> > > > seems
> >> > > > > > >>> > > > > like enough ;)
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > Considering the number of different times these
> >> topic
> >> > > > > > >>> identifiers are
> >> > > > > > >>> > > > sent
> >> > > > > > >>> > > > > over the wire or stored in memory, it seems like
> it
> >> > might
> >> > > > be
> >> > > > > > >>> worth
> >> > > > > > >>> > the
> >> > > > > > >>> > > > > additional 4x space savings.
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > What do you think about this?
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > Thanks,
> >> > > > > > >>> > > > > John
> >> > > > > > >>> > > > >
> >> > > > > > >>> > > > > On Fri, Sep 11, 2020, at 03:20, Tom Bentley wrote:
> >> > > > > > >>> > > > > > Hi Justine,
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > This looks like a very welcome improvement.
> >> Thanks!
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > Maybe I missed it, but the KIP doesn't seem to
> >> > mention
> >> > > > > > changing
> >> > > > > > >>> > > > > > DeleteTopicsRequest to identify the topic using
> an
> >> > id.
> >> > > > > Maybe
> >> > > > > > >>> > that's out
> >> > > > > > >>> > > > > of
> >> > > > > > >>> > > > > > scope, but DeleteTopicsRequest is not listed
> among
> >> > the
> >> > > > > Future
> >> > > > > > >>> Work
> >> > > > > > >>> > APIs
> >> > > > > > >>> > > > > > either.
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > Kind regards,
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > Tom
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > On Thu, Sep 10, 2020 at 3:59 PM Satish Duggana <
> >> > > > > > >>> > > > satish.dugg...@gmail.com
> >> > > > > > >>> > > > > > wrote:
> >> > > > > > >>> > > > > >
> >> > > > > > >>> > > > > > > Thanks Lucas/Justine for the nice KIP.
> >> > > > > > >>> > > > > > >
> >> > > > > > >>> > > > > > > It has several benefits which also include
> >> > > simplifying
> >> > > > > the
> >> > > > > > >>> topic
> >> > > > > > >>> > > > > > > deletion process by controller and logs
> cleanup
> >> by
> >> > > > > brokers
> >> > > > > > in
> >> > > > > > >>> > corner
> >> > > > > > >>> > > > > > > cases.
> >> > > > > > >>> > > > > > >
> >> > > > > > >>> > > > > > > Best,
> >> > > > > > >>> > > > > > > Satish.
> >> > > > > > >>> > > > > > >
> >> > > > > > >>> > > > > > > On Wed, Sep 9, 2020 at 10:07 PM Justine
> Olshan <
> >> > > > > > >>> > jols...@confluent.io
> >> > > > > > >>> > > > > > > wrote:
> >> > > > > > >>> > > > > > > > Hello all, it's been almost a year! I've
> made
> >> > some
> >> > > > > > changes
> >> > > > > > >>> to
> >> > > > > > >>> > this
> >> > > > > > >>> > > > > KIP
> >> > > > > > >>> > > > > > > and hope to continue the discussion.
> >> > > > > > >>> > > > > > > > One of the main changes I've added is now
> the
> >> > > > metadata
> >> > > > > > >>> response
> >> > > > > > >>> > > > will
> >> > > > > > >>> > > > > > > include the topic ID (as Colin suggested).
> >> Clients
> >> > > can
> >> > > > > > >>> obtain the
> >> > > > > > >>> > > > > topicID
> >> > > > > > >>> > > > > > > of a given topic through a TopicDescription.
> The
> >> > > > topicId
> >> > > > > > will
> >> > > > > > >>> > also be
> >> > > > > > >>> > > > > > > included with the UpdateMetadata request.
> >> > > > > > >>> > > > > > > > Let me know what you all think.
> >> > > > > > >>> > > > > > > > Thank you,
> >> > > > > > >>> > > > > > > > Justine
> >> > > > > > >>> > > > > > > >
> >> > > > > > >>> > > > > > > > On 2019/09/13 16:38:26, "Colin McCabe" <
> >> > > > > > cmcc...@apache.org
> >> > > > > > >>> >
> >> > > > > > >>> > wrote:
> >> > > > > > >>> > > > > > > > > Hi Lucas,
> >> > > > > > >>> > > > > > > > >
> >> > > > > > >>> > > > > > > > > Thanks for tackling this.  Topic IDs are a
> >> > great
> >> > > > > idea,
> >> > > > > > >>> and
> >> > > > > > >>> > this
> >> > > > > > >>> > > > is
> >> > > > > > >>> > > > > a
> >> > > > > > >>> > > > > > > really good writeup.
> >> > > > > > >>> > > > > > > > > For /brokers/topics/[topic], the schema
> >> version
> >> > > > > should
> >> > > > > > be
> >> > > > > > >>> > bumped
> >> > > > > > >>> > > > to
> >> > > > > > >>> > > > > > > version 3, rather than 2.  KIP-455 bumped the
> >> > version
> >> > > > of
> >> > > > > > this
> >> > > > > > >>> > znode
> >> > > > > > >>> > > > to
> >> > > > > > >>> > > > > 2
> >> > > > > > >>> > > > > > > already :)
> >> > > > > > >>> > > > > > > > > Given that we're going to be seeing these
> >> > things
> >> > > as
> >> > > > > > >>> strings
> >> > > > > > >>> > as
> >> > > > > > >>> > > > lot
> >> > > > > > >>> > > > > (in
> >> > > > > > >>> > > > > > > logs, in ZooKeeper, on the command-line,
> etc.),
> >> > does
> >> > > it
> >> > > > > > make
> >> > > > > > >>> > sense to
> >> > > > > > >>> > > > > use
> >> > > > > > >>> > > > > > > base64 when converting them to strings?
> >> > > > > > >>> > > > > > > > > Here is an example of the hex
> >> representation:
> >> > > > > > >>> > > > > > > > > 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8
> >> > > > > > >>> > > > > > > > >
> >> > > > > > >>> > > > > > > > > And here is an example in base64.
> >> > > > > > >>> > > > > > > > > b8tRS7h4TJ2Vt43Dp85v2A
> >> > > > > > >>> > > > > > > > >
> >> > > > > > >>> > > > > > > > > The base64 version saves 15 letters (to be
> >> > fair,
> >> > > 4
> >> > > > of
> >> > > > > > >>> those
> >> > > > > > >>> > were
> >> > > > > > >>> > > > > > > dashes that we could have elided in the hex
> >> > > > > > representation.)
> >> > > > > > >>> > > > > > > > > Another thing to consider is that we
> should
> >> > > specify
> >> > > > > > that
> >> > > > > > >>> the
> >> > > > > > >>> > > > > > > all-zeroes UUID is not a valid topic UUID.
>  We
> >> > can't
> >> > > > use
> >> > > > > > >>> null
> >> > > > > > >>> > for
> >> > > > > > >>> > > > this
> >> > > > > > >>> > > > > > > because we can't pass a null UUID over the RPC
> >> > > protocol
> >> > > > > > >>> (there
> >> > > > > > >>> > is no
> >> > > > > > >>> > > > > > > special pattern for null, nor do we want to
> >> waste
> >> > > space
> >> > > > > > >>> reserving
> >> > > > > > >>> > > > such
> >> > > > > > >>> > > > > a
> >> > > > > > >>> > > > > > > pattern.)
> >> > > > > > >>> > > > > > > > > Maybe I missed it, but did you describe
> >> > > "migration
> >> > > > > > of...
> >> > > > > > >>> > existing
> >> > > > > > >>> > > > > > > topic[s] without topic IDs" in detail in any
> >> > section?
> >> > > > It
> >> > > > > > >>> seems
> >> > > > > > >>> > like
> >> > > > > > >>> > > > > when
> >> > > > > > >>> > > > > > > the new controller becomes active, it should
> >> just
> >> > > > > generate
> >> > > > > > >>> random
> >> > > > > > >>> > > > > UUIDs for
> >> > > > > > >>> > > > > > > these, and write the random UUIDs back to
> >> > ZooKeeper.
> >> > > > It
> >> > > > > > >>> would be
> >> > > > > > >>> > > > good
> >> > > > > > >>> > > > > to
> >> > > > > > >>> > > > > > > spell that out.  We should make it clear that
> >> this
> >> > > > > happens
> >> > > > > > >>> > regardless
> >> > > > > > >>> > > > > of
> >> > > > > > >>> > > > > > > the inter-broker protocol version (it's a
> >> > compatible
> >> > > > > > change).
> >> > > > > > >>> > > > > > > > > "LeaderAndIsrRequests including an
> >> > > > is_every_partition
> >> > > > > > >>> flag"
> >> > > > > > >>> > > > seems a
> >> > > > > > >>> > > > > > > bit wordy.  Can we just call these "full
> >> > > > > > >>> LeaderAndIsrRequests"?
> >> > > > > > >>> > Then
> >> > > > > > >>> > > > > the
> >> > > > > > >>> > > > > > > RPC field could be named "full".  Also, it
> would
> >> > > > probably
> >> > > > > > be
> >> > > > > > >>> > better
> >> > > > > > >>> > > > > for the
> >> > > > > > >>> > > > > > > RPC field to be an enum of { UNSPECIFIED,
> >> > > INCREMENTAL,
> >> > > > > FULL
> >> > > > > > >>> }, so
> >> > > > > > >>> > > > that
> >> > > > > > >>> > > > > we
> >> > > > > > >>> > > > > > > can cleanly handle old versions (by treating
> >> them
> >> > as
> >> > > > > > >>> UNSPECIFIED)
> >> > > > > > >>> > > > > > > > > In the LeaderAndIsrRequest section, you
> >> write
> >> > "A
> >> > > > > final
> >> > > > > > >>> > deletion
> >> > > > > > >>> > > > > event
> >> > > > > > >>> > > > > > > will be secheduled for X ms after the
> >> > > > LeaderAndIsrRequest
> >> > > > > > was
> >> > > > > > >>> > first
> >> > > > > > >>> > > > > > > received..."  I guess the X was a placeholder
> >> that
> >> > > you
> >> > > > > > >>> intended
> >> > > > > > >>> > to
> >> > > > > > >>> > > > > replace
> >> > > > > > >>> > > > > > > before posting? :)  In any case, this seems
> like
> >> > the
> >> > > > kind
> >> > > > > > of
> >> > > > > > >>> > thing
> >> > > > > > >>> > > > we'd
> >> > > > > > >>> > > > > > > want a configuration for.  Let's describe that
> >> > > > > > configuration
> >> > > > > > >>> key
> >> > > > > > >>> > > > > somewhere
> >> > > > > > >>> > > > > > > in this KIP, including what its default value
> >> is.
> >> > > > > > >>> > > > > > > > > We should probably also log a bunch of
> >> messages
> >> > > at
> >> > > > > WARN
> >> > > > > > >>> level
> >> > > > > > >>> > > > when
> >> > > > > > >>> > > > > > > something is scheduled for deletion, as well.
> >> > (Maybe
> >> > > > > this
> >> > > > > > >>> was
> >> > > > > > >>> > > > > assumed, but
> >> > > > > > >>> > > > > > > it would be good to mention it).
> >> > > > > > >>> > > > > > > > > I feel like there are a few sections that
> >> > should
> >> > > be
> >> > > > > > >>> moved to
> >> > > > > > >>> > > > > "rejected
> >> > > > > > >>> > > > > > > alternatives."  For example, in the
> DeleteTopics
> >> > > > section,
> >> > > > > > >>> since
> >> > > > > > >>> > we're
> >> > > > > > >>> > > > > not
> >> > > > > > >>> > > > > > > going to do option 1 or 2, these should be
> moved
> >> > into
> >> > > > > > >>> "rejected
> >> > > > > > >>> > > > > > > alternatives,"  rather than appearing inline.
> >> > > Another
> >> > > > > case
> >> > > > > > >>> is
> >> > > > > > >>> > the
> >> > > > > > >>> > > > > "Should
> >> > > > > > >>> > > > > > > we remove topic name from the protocol where
> >> > > possible"
> >> > > > > > >>> section.
> >> > > > > > >>> > This
> >> > > > > > >>> > > > > is
> >> > > > > > >>> > > > > > > clearly discussing a design alternative that
> >> we're
> >> > > not
> >> > > > > > >>> proposing
> >> > > > > > >>> > to
> >> > > > > > >>> > > > > > > implement: removing the topic name from those
> >> > > > protocols.
> >> > > > > > >>> > > > > > > > > Is it really necessary to have a new
> >> > > > > > >>> > /admin/delete_topics_by_id
> >> > > > > > >>> > > > > path
> >> > > > > > >>> > > > > > > in ZooKeeper?  It seems like we don't really
> >> need
> >> > > this.
> >> > > > > > >>> Whenever
> >> > > > > > >>> > > > > there is
> >> > > > > > >>> > > > > > > a new controller, we'll send out full
> >> > > > > LeaderAndIsrRequests
> >> > > > > > >>> which
> >> > > > > > >>> > will
> >> > > > > > >>> > > > > > > trigger the stale topics to be cleaned up.
>  The
> >> > > active
> >> > > > > > >>> > controller
> >> > > > > > >>> > > > will
> >> > > > > > >>> > > > > > > also send the full LeaderAndIsrRequest to
> >> brokers
> >> > > that
> >> > > > > are
> >> > > > > > >>> just
> >> > > > > > >>> > > > > starting
> >> > > > > > >>> > > > > > > up.    So we don't really need this kind of
> >> > two-phase
> >> > > > > > commit
> >> > > > > > >>> > (send
> >> > > > > > >>> > > > out
> >> > > > > > >>> > > > > > > StopReplicasRequest, get ACKs from all nodes,
> >> > commit
> >> > > by
> >> > > > > > >>> removing
> >> > > > > > >>> > > > > > > /admin/delete_topics node) any more.
> >> > > > > > >>> > > > > > > > > You mention that FetchRequest will now
> >> include
> >> > > UUID
> >> > > > > to
> >> > > > > > >>> avoid
> >> > > > > > >>> > > > issues
> >> > > > > > >>> > > > > > > where requests are made to stale partitions.
> >> > > However,
> >> > > > > > >>> adding a
> >> > > > > > >>> > UUID
> >> > > > > > >>> > > > to
> >> > > > > > >>> > > > > > > MetadataRequest is listed as future work, out
> of
> >> > > scope
> >> > > > > for
> >> > > > > > >>> this
> >> > > > > > >>> > KIP.
> >> > > > > > >>> > > > > How
> >> > > > > > >>> > > > > > > will the client learn what the topic UUID is,
> if
> >> > the
> >> > > > > > metadata
> >> > > > > > >>> > > > response
> >> > > > > > >>> > > > > > > doesn't include that information?  It seems
> like
> >> > > adding
> >> > > > > the
> >> > > > > > >>> UUID
> >> > > > > > >>> > to
> >> > > > > > >>> > > > > > > MetadataResponse would be an improvement here
> >> that
> >> > > > might
> >> > > > > > not
> >> > > > > > >>> be
> >> > > > > > >>> > too
> >> > > > > > >>> > > > > hard to
> >> > > > > > >>> > > > > > > make.
> >> > > > > > >>> > > > > > > > > best,
> >> > > > > > >>> > > > > > > > > Colin
> >> > > > > > >>> > > > > > > > >
> >> > > > > > >>> > > > > > > > >
> >> > > > > > >>> > > > > > > > > On Mon, Sep 9, 2019, at 17:48, Ryanne
> Dolan
> >> > > wrote:
> >> > > > > > >>> > > > > > > > > > Lucas, this would be great. I've run
> into
> >> > > issues
> >> > > > > with
> >> > > > > > >>> > topics
> >> > > > > > >>> > > > > being
> >> > > > > > >>> > > > > > > > > > resurrected accidentally, since a client
> >> > cannot
> >> > > > > > easily
> >> > > > > > >>> > > > > distinguish
> >> > > > > > >>> > > > > > > between
> >> > > > > > >>> > > > > > > > > > a deleted topic and a new topic with the
> >> same
> >> > > > name.
> >> > > > > > I'd
> >> > > > > > >>> > need
> >> > > > > > >>> > > > the
> >> > > > > > >>> > > > > ID
> >> > > > > > >>> > > > > > > > > > accessible from the client to solve that
> >> > issue,
> >> > > > but
> >> > > > > > >>> this
> >> > > > > > >>> > is a
> >> > > > > > >>> > > > > good
> >> > > > > > >>> > > > > > > first
> >> > > > > > >>> > > > > > > > > > step.
> >> > > > > > >>> > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > Ryanne
> >> > > > > > >>> > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > On Wed, Sep 4, 2019 at 1:41 PM Lucas
> >> > > Bradstreet <
> >> > > > > > >>> > > > > lu...@confluent.io>
> >> > > > > > >>> > > > > > > wrote:
> >> > > > > > >>> > > > > > > > > > > Hi all,
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > > I would like to kick off discussion of
> >> > > KIP-516,
> >> > > > > an
> >> > > > > > >>> > > > > implementation
> >> > > > > > >>> > > > > > > of topic
> >> > > > > > >>> > > > > > > > > > > IDs for Kafka. Topic IDs aim to solve
> >> topic
> >> > > > > > >>> uniqueness
> >> > > > > > >>> > > > > problems in
> >> > > > > > >>> > > > > > > Kafka,
> >> > > > > > >>> > > > > > > > > > > where referring to a topic by name
> >> alone is
> >> > > > > > >>> insufficient.
> >> > > > > > >>> > > > Such
> >> > > > > > >>> > > > > > > cases
> >> > > > > > >>> > > > > > > > > > > include when a topic has been deleted
> >> and
> >> > > > > recreated
> >> > > > > > >>> with
> >> > > > > > >>> > the
> >> > > > > > >>> > > > > same
> >> > > > > > >>> > > > > > > name.
> >> > > > > > >>> > > > > > > > > > > Unique identifiers will help simplify
> >> and
> >> > > > improve
> >> > > > > > >>> Kafka's
> >> > > > > > >>> > > > topic
> >> > > > > > >>> > > > > > > deletion
> >> > > > > > >>> > > > > > > > > > > process, as well as prevent cases
> where
> >> > > brokers
> >> > > > > may
> >> > > > > > >>> > > > incorrectly
> >> > > > > > >>> > > > > > > interact
> >> > > > > > >>> > > > > > > > > > > with stale versions of topics.
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> > > >
> >> > > > > > >>> >
> >> > > > > > >>>
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers
> >> > > > > > >>> > > > > > > > > > > Looking forward to your thoughts.
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> > > > > > > > > > > Lucas
> >> > > > > > >>> > > > > > > > > > >
> >> > > > > > >>> >
> >> > > > > > >>> >
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to