Hey Matthias,

Thanks for the comments.

I think when we compact a topic, we only delete messages for which there
is a later message with the same key, and we do not change the offset of
any remaining message. As long as the offsets of existing messages are not
changed, I think the KIP should still work in the sense that it enforces
the property explained in the Goals section of the updated KIP. Can you see
if there is anything that does not work for compacted topics?
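
To make the point about offsets concrete, here is a minimal sketch of
compaction over an in-memory log. It is a simplified model, not Kafka's log
cleaner: the Record tuple and the compact() helper are hypothetical names
used only for illustration, and tombstones and the uncompacted active
segment are ignored.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record shape for illustration: (offset, key, value).
Record = Tuple[int, str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the latest record per key.

    Surviving records are returned unchanged, so each one keeps the
    offset it was originally assigned.
    """
    latest_offset: Dict[str, int] = {}
    for offset, key, _value in log:
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [rec for rec in log if latest_offset[rec[1]] == rec[0]]


log: List[Record] = [
    (0, "a", "v1"),
    (1, "b", "v1"),
    (2, "a", "v2"),
    (3, "c", "v1"),
    (4, "b", "v2"),
]
compacted = compact(log)
```

In this model the compacted log has gaps in its offsets, but no surviving
message is renumbered, which is the only property the argument above
relies on.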

It is true that if a topic is not subject to either time-based or
size-based retention, then we cannot delete a partition of this topic. I
don't think this is necessarily related to compacted topics. Today we can
have a non-compacted topic with a very long retention, such that we cannot
delete partitions from this topic for a long period. And we can also specify
retention for a compacted topic such that all messages in a partition of
this compacted topic will be deleted after some time. This KIP ensures that
a partition can be deleted after all messages in the partition have been
deleted, which is the common case. If a user wants to keep very old messages
while still deleting partitions, maybe we should have a separate KIP that
allows us to merge-sort the existing partitions into a new (and smaller) set
of partitions. What do you think?

Regarding older producers/consumers, my current understanding is that old
clients can still produce/consume with the current behavior, i.e. consumers
may consume messages out of order if there is partition expansion or
deletion. Old clients still have the ordering guarantee if the partitions of
the topic do not change. It should be backward compatible. Users need to
upgrade the client library in order to use the new feature.
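
As a rough illustration of why ordering can break for old clients when the
partition count changes, the sketch below maps keys to partitions with
hash(key) mod N. The partition_for() helper is hypothetical, and CRC32 is
only a stand-in for the hash used by the real default partitioner.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Hypothetical key-to-partition mapping: hash(key) mod partition count.

    CRC32 is an assumption made for this sketch, not the client's real hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys: List[str] = [f"key-{i}" for i in range(100)]

# Keys whose partition changes when the topic grows from 2 to 3 partitions.
moved = [k for k in keys if partition_for(k, 2) != partition_for(k, 3)]

# With an unchanged partition count the mapping is stable.
stable = all(partition_for(k, 2) == partition_for(k, 2) for k in keys)
```

For any key in moved, messages produced before the expansion sit in one
partition and messages produced after it sit in another, so a consumer that
reads the two partitions independently has no ordering guarantee between
them. When the partition count does not change, every key stays in a single
partition and the existing per-partition ordering still applies.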

Thanks,
Dong

On Thu, Feb 22, 2018 at 6:24 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

> Dong,
>
> thanks a lot for the KIP!
>
> Can you elaborate how this would work for compacted topics? If it does
> not work for compacted topics, I think Streams API cannot allow to scale
> input topics.
>
> This question seems to be particularly interesting for deleting
> partitions: assume that a key is never (or for a very long time)
> updated, a partition cannot be deleted.
>
>
> -Matthias
>
>
> On 2/22/18 5:19 PM, Jay Kreps wrote:
> > Hey Dong,
> >
> > Two questions:
> > 1. How will this work with Streams and Connect?
> > 2. How does this compare to a solution where we physically split
> > partitions using a linear hashing approach (the partition number is
> > equivalent to the hash bucket in a hash table)?
> > https://en.wikipedia.org/wiki/Linear_hashing
> >
> > -Jay
> >
> > On Sat, Feb 10, 2018 at 3:35 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I have created KIP-253: Support in-order message delivery with partition
> >> expansion. See
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-253%3A+Support+in-order+message+delivery+with+partition+expansion
> >> .
> >>
> >> This KIP provides a way to allow messages of the same key from the same
> >> producer to be consumed in the same order they are produced even if we
> >> expand partition of the topic.
> >>
> >> Thanks,
> >> Dong
> >>
> >
>
>
