Hi Jason / Matthias,

I understand your concerns now. Just to clarify my main motivation on
deprecating the old APIs is not only for aesthetics (confess I personally
do have preference on succinct APIs), but to urge people to use the batched
API for better latency when possible --- as stated in the KIP, my
observation is that in practice most callers are actually going to get
committed offsets for more than one partitions, and without deprecating the
old APIs it would be hard for them to realize that the new API does have a
benefit in performance.

This is different from some of the existing APIs though -- e.g., Matthias
mentioned about seek / seekToBeginning / seekToEnd, where only seekToXX
have plurals and seek only have singulars. We can, of course, make seekToXX
with plurals as well just like commitSync/Async, but since seeks are
non-blocking APIs (they rely on the follow-up polling call to talk to the
brokers) either calling it multiple times with one partition each v.s.
calling it one time with a plural makes little difference (still, I'd argue
that today we do not have a same-named function overloaded with both types
:P) On the other hand, committed, commitSync, offsetsForTimes etc blocking
calls are all in the form of plurals except

* committed
* position
* partitionsFor

My rationale was that 1) for consecutive calls of #position, mostly it
would only require a single round-trip to brokers since we are trying to
refresh fetching positions for all partitions anyways, and 2) for
#partitionsFor, theoretically we could also consider to ask for multiple
topics in one call since each blocking call potentially incurs one round
trip, but I did not include it in the scope of this KIP only because I have
not observed too many usage patterns that are commonly calling it
consecutively for multiple topics. At the moment, what I truly want to
"improve" on is the committed calls, as in many cases I've seen it being
called consecutively for multiple topic-partitions.

Therefore, I'm still more inclined to deprecate the old APIs so that we can
enforce people to discover the new batching APIs for efficiency in this
KIP. But if you feel that this compatibility is very crucial to maintain I
could be convinced.


Guozhang

On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax <matth...@confluent.io>
wrote:

> Thanks for the KIP Guozhang.
>
> > Another reason is that other functions of KafkaConsumers do not have
> those
> > overloaded functions to be consistent
>
> I tend to agree with Jason about keeping the existing methods. Your
> argument does not seem to hold. I just checked the `Consumer` API, and
> it's mix of overloads:
>
> Methods only talking `Collections`
>
>
> subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets
>
> Method with overload taking `Collections` or as single value:
>
> seek/seekToBeginning/seekToEnd
>
> (those are strictly different methods, but they are semantically related)
>
> Only talking single value:
>
> position/committed/partitionsFor
>
>
> While you are strictly speaking correct, that there is no method with an
> overload for `Collection` and single value, the API mix seems to suggest
> that it might actually be worth to have corresponding overloads for all
> methods instead of sticking to `Collections` only.
>
>
>
> About the return map: I agree that not containing any entry in the map
> is better. It's not only consistent with other APIs but it also avoids
> potential NPEs.
>
>
>
> -Matthias
>
>
> On 9/10/19 10:04 AM, Jason Gustafson wrote:
> >>  I feel it not worth making committed to have both plurals and
> singulars.
> >
> > Not sure I agree. If we had started with these new APIs from the
> beginning,
> > that may have been better, but we already have exposed the singular APIs
> > and users are depending on them. Not sure it's worth breaking
> compatibility
> > just for aesthetics.
> >
> > -Jason
> >
> > On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> >> Thanks Jason!
> >>
> >> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson <ja...@confluent.io>
> >> wrote:
> >>
> >>> Hi Guozhang,
> >>>
> >>> I think the motivation for the new API makes sense. I've wanted
> something
> >>> like this in the past. That said, do you think there is a substantial
> >>> benefit from deprecating the old API? I can still see it being
> convenient
> >>> in some cases and it's no real cost to maintain.
> >>>
> >>>
> >> That's a good question.
> >>
> >> Personally I would like to keep a succinct set of APIs out of the box
> and
> >> let users who want more syntax sugars to add themselves as extended
> classes
> >> for example (KafkaConsumer is not a final class).
> >> Another reason is that other functions of KafkaConsumers do not have
> those
> >> overloaded functions to be consistent, e.g. we do not have a
> >> subscribe(single-topic), pause/resume(single-topic-partition) or
> >> seekToBeginning(single-topic-partition). I feel it not worth making
> >> committed to have both plurals and singulars.
> >>
> >> WDYT?
> >>
> >>
> >>> Also, just a minor detail. If the partition has no committed offset,
> will
> >>> it be present in the map with a null value?
> >>>
> >>> I looked into the admin client's listConsumerGroupOffsets call when
> >> creating the KIP, and to be consistent with that API my intention is to
> NOT
> >> include the entry if a topic-partition does not have committed offsets.
> >> That said, if we feel returning an entry with null value is better for
> >> programmability I can also do that (and will update wiki page to
> clarify as
> >> well). LMK.
> >>
> >>
> >>> -Jason
> >>>
> >>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison <
> mickael.mai...@gmail.com
> >>>
> >>> wrote:
> >>>
> >>>> +1 (non-binding), thanks Guozhang
> >>>>
> >>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen <
> >> reluctanthero...@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Hey Guozhang,
> >>>>>
> >>>>> LGTM, +1 (non-binding)
> >>>>>
> >>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang <wangg...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>>> Hello folks,
> >>>>>>
> >>>>>> I've created a new KIP allowing consumer.committed to take a set of
> >>>>>> partitions instead of just one partition to allow batching effects
> >> of
> >>>> such
> >>>>>> requests (the protocol already allows us to send multiple
> >> partitions
> >>>> in one
> >>>>>> round-trip):
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions
> >>>>>>
> >>>>>> Since it is a pretty straight-forward KIP, I'm starting the VOTE
> >> for
> >>>> this
> >>>>>> KIP directly. If there are any suggestions about this proposal,
> >>> please
> >>>> feel
> >>>>>> free to share them in this thread. Thank you!
> >>>>>>
> >>>>>>
> >>>>>> -- Guozhang
> >>>>>>
> >>>>
> >>>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>

-- 
-- Guozhang

Reply via email to