Hi, Justine,

Thanks for the explanation. Your PR made the RecordAccumulator more
cooperative with the new partitioning scheme. So, what I mentioned won't be
an issue. The KIP looks good to me then.

Jun

On Fri, Jul 12, 2019 at 9:02 AM Justine Olshan <jols...@confluent.io> wrote:

> Hello all,
>
> Jun, thanks for taking a look at my KIP! We were also concerned about
> batches containing a single record so we kept this in mind for the
> implementation. The decision to switch the sticky partition actually
> involves returning from the record accumulator and assigning the new
> partition before the new batch is created. That way all of the records will
> go to this new partition's batch. If you would like to get a better look at
> how this works, please check out the PR:
> https://github.com/apache/kafka/pull/6997/files. The most important lines
> are in the append method of the RecordAccumulator and doSend in
> KafkaProducer.
>
> Colin, I think this makes sense to me except for the name
> StickyRoundRobinPartitioner seems to not really explain the behavior of
> what would be implemented. Perhaps a name indicating the sticky behavior is
> always used, or that it will be used on keys is more descriptive. Calling
> it "RoundRobin" seems a bit misleading to me.
>
> Thanks again for reviewing,
> Justine
>
> On Thu, Jul 11, 2019 at 6:07 PM Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Justine,
> >
> > Thanks for the KIP. Nice writeup and great results. Just one comment.
> >
> > 100. To add a record to the accumulator, the producer needs to know the
> > partition id. The decision of whether the record can be added to the
> > current batch is only made after the accumulator.append() call. So, when
> a
> > batch is full, it seems that the KIP will try to append the next record
> to
> > the same partition, which will trigger the creation of a new batch with a
> > single record. After that, new records will be routed to a new partition.
> > If the producer doesn't come back to the first partition in time, the
> > producer will send a single record batch. In the worse case, it can be
> that
> > every other batch has only a single record. Is this correct? If so, could
> > we avoid that?
> >
> > Jun
> >
> > On Thu, Jul 11, 2019 at 5:23 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Justine,
> > >
> > > I agree that we shouldn't change RoundRobinPartitioner, since its
> > behavior
> > > is already specified.
> > >
> > > However, we could add a new, separate StickyRoundRobinPartitioner class
> > to
> > > KIP-480 which just implemented the sticky behavior regardless of
> whether
> > > the key was null.  That seems pretty easy to add (and it wouldn't have
> to
> > > implemented right away in the first PR, of course.)  It would be an
> > option
> > > for people who wanted to configure this behavior.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Wed, Jul 10, 2019, at 08:48, Justine Olshan wrote:
> > > > Hi M,
> > > >
> > > > I'm a little confused by what you mean by extending the behavior on
> to
> > > the
> > > > RoundRobinPartitioner.
> > > > The sticky partitioner plans to remove the round-robin behavior from
> > > > records with no keys. Instead of sending them to each partition in
> > order,
> > > > it sends them all to the same partition until the batch is sent.
> > > > I don't think you can have both round-robin and sticky partition
> > > behavior.
> > > >
> > > > Thank you,
> > > > Justine Olshan
> > > >
> > > > On Wed, Jul 10, 2019 at 1:54 AM M. Manna <manme...@gmail.com> wrote:
> > > >
> > > > > Thanks for the comments Colin.
> > > > >
> > > > > My only concern is that this KIP is addressing a good feature and
> > > having
> > > > > that extended to RoundRobinPartitioner means 1 less KIP in the
> > future.
> > > > >
> > > > > Would it be appropriate to extend the support to
> > RoundRobinPartitioner
> > > too?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > On Tue, 9 Jul 2019 at 17:24, Colin McCabe <cmcc...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi M,
> > > > > >
> > > > > > The RoundRobinPartitioner added by KIP-369 doesn't interact with
> > this
> > > > > > KIP.  If you configure your producer to use
> RoundRobinPartitioner,
> > > then
> > > > > the
> > > > > > DefaultPartitioner will not be used.  And the "sticky" behavior
> is
> > > > > > implemented only in the DefaultPartitioner.
> > > > > >
> > > > > > regards,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Tue, Jul 9, 2019, at 05:12, M. Manna wrote:
> > > > > > > Hello Justine,
> > > > > > >
> > > > > > > I have one item I wanted to discuss.
> > > > > > >
> > > > > > > We are currently in review stage for KAFKA-3333 where we can
> > choose
> > > > > > always
> > > > > > > RoundRobin regardless of null/usable key.
> > > > > > >
> > > > > > > If I understood this KIP motivation correctly, you are still
> > > honouring
> > > > > > how
> > > > > > > the hashing of key works for DefaultPartitioner. Would you say
> > that
> > > > > > having
> > > > > > > an always "Round-Robin" partitioning with "Sticky" assignment
> > > > > (efficient
> > > > > > > batching of records for a partition) doesn't deviate from your
> > > original
> > > > > > > intention?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > On Tue, 9 Jul 2019 at 01:00, Justine Olshan <
> > jols...@confluent.io>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hello all,
> > > > > > > >
> > > > > > > > If there are no more comments or concerns, I would like to
> > start
> > > the
> > > > > > vote
> > > > > > > > on this tomorrow afternoon.
> > > > > > > >
> > > > > > > > However, if there are still topics to discuss, feel free to
> > bring
> > > > > them
> > > > > > up
> > > > > > > > now.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Justine
> > > > > > > >
> > > > > > > > On Tue, Jul 2, 2019 at 4:25 PM Justine Olshan <
> > > jols...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello again,
> > > > > > > > >
> > > > > > > > > Another update to the interface has been made to the KIP.
> > > > > > > > > Please let me know if you have any feedback!
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > > Justine
> > > > > > > > >
> > > > > > > > > On Fri, Jun 28, 2019 at 2:52 PM Justine Olshan <
> > > > > jols...@confluent.io
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Hi all,
> > > > > > > > >> I made some changes to the KIP.
> > > > > > > > >> The idea is to clean up the code, make behavior more
> > explicit,
> > > > > > provide
> > > > > > > > >> more flexibility, and to keep default behavior the same.
> > > > > > > > >>
> > > > > > > > >> Now we will change the partition in onNewBatch, and
> specify
> > > the
> > > > > > > > >> conditions for this function call (non-keyed values, no
> > > explicit
> > > > > > > > >> partitions) in willCallOnNewBatch.
> > > > > > > > >> This clears up some of the issues with the implementation.
> > I'm
> > > > > > happy to
> > > > > > > > >> hear further opinions and discuss this change!
> > > > > > > > >>
> > > > > > > > >> Thank you,
> > > > > > > > >> Justine
> > > > > > > > >>
> > > > > > > > >> On Thu, Jun 27, 2019 at 2:53 PM Colin McCabe <
> > > cmcc...@apache.org>
> > > > > > > > wrote:
> > > > > > > > >>
> > > > > > > > >>> On Thu, Jun 27, 2019, at 01:31, Ismael Juma wrote:
> > > > > > > > >>> > Thanks for the KIP Justine. It looks pretty good. A few
> > > > > comments:
> > > > > > > > >>> >
> > > > > > > > >>> > 1. Should we favor partitions that are not under
> > > replicated?
> > > > > > This is
> > > > > > > > >>> > something that Netflix did too.
> > > > > > > > >>>
> > > > > > > > >>> This seems like it could lead to cascading failures,
> right?
> > > If a
> > > > > > > > >>> partition becomes under-replicated because there is too
> > much
> > > > > > traffic,
> > > > > > > > the
> > > > > > > > >>> producer stops sending to it, which puts even more load
> on
> > > the
> > > > > > > > remaining
> > > > > > > > >>> partitions, which are even more likely to fail then, etc.
> > It
> > > > > also
> > > > > > will
> > > > > > > > >>> create unbalanced load patterns on the consumers.
> > > > > > > > >>>
> > > > > > > > >>> >
> > > > > > > > >>> > 2. If there's no measurable performance difference, I
> > agree
> > > > > with
> > > > > > > > >>> Stanislav
> > > > > > > > >>> > that Optional would be better than Integer.
> > > > > > > > >>> >
> > > > > > > > >>> > 3. We should include the javadoc for the newly
> introduced
> > > > > method
> > > > > > that
> > > > > > > > >>> > specifies it and its parameters. In particular, it
> would
> > > good
> > > > > to
> > > > > > > > >>> specify if
> > > > > > > > >>> > it gets called when an explicit partition id has been
> > > provided.
> > > > > > > > >>>
> > > > > > > > >>> Agreed.
> > > > > > > > >>>
> > > > > > > > >>> best,
> > > > > > > > >>> Colin
> > > > > > > > >>>
> > > > > > > > >>> >
> > > > > > > > >>> > Ismael
> > > > > > > > >>> >
> > > > > > > > >>> > On Mon, Jun 24, 2019, 2:04 PM Justine Olshan <
> > > > > > jols...@confluent.io>
> > > > > > > > >>> wrote:
> > > > > > > > >>> >
> > > > > > > > >>> > > Hello,
> > > > > > > > >>> > > This is the discussion thread for KIP-480: Sticky
> > > > > Partitioner.
> > > > > > > > >>> > >
> > > > > > > > >>> > >
> > > > > > > > >>> > >
> > > > > > > > >>>
> > > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > > > > > > >>> > >
> > > > > > > > >>> > > Thank you,
> > > > > > > > >>> > > Justine Olshan
> > > > > > > > >>> > >
> > > > > > > > >>> >
> > > > > > > > >>>
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to