Hi,

An option that is not currently covered in the KIP is to have a separate
config max.block.ms, which is similar to the producer config with the same
name. This came up during the KAFKA-2391 discussion. I think it's clear
that we can't rely on request.timeout.ms, so the decision is between adding
overloads or adding a new config. People seemed to be leaning towards the
latter in KAFKA-2391, but Jason makes a good point that the overloads are
more flexible. A couple of questions from me:

1. Do we need the additional flexibility?
2. If we do, do we need it for every blocking method?

Ismael

On Mon, Mar 19, 2018 at 5:03 PM, Richard Yu <yohan.richard...@gmail.com>
wrote:

> Hi Guozhang,
>
> I made some clarifications to KIP-266, namely:
> 1. Stated more specifically that commitSync will accept user input.
> 2. fetchCommittedOffsets(): Made its role in blocking more clear to the
> reader.
> 3. Sketched what would happen when time limit is exceeded.
>
> These changes should make the KIP easier to understand.
>
> Cheers,
> Richard
>
> On Mon, Mar 19, 2018 at 9:33 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hi Richard,
> >
> > I made a pass over the KIP again, some more clarifications / comments:
> >
> > 1. seek() call itself is not blocking, only the following poll() call may
> > be blocking as the actually metadata rq will happen.
> >
> > 2. I saw you did not include Consumer.partitionFor(),
> > Consumer.OffsetAndTimestamp() and Consumer.listTopics() in your KIP.
> After
> > a second thought, I think this may be a better idea to not tackle them in
> > the same KIP, and probably we should consider whether we would change the
> > behavior or not in another discussion. So I agree to not include them.
> >
> > 3. In your wiki you mentioned "Another change shall be made to
> > KafkaConsumer#poll(), due to its call to updateFetchPositions() which
> > blocks indefinitely." This part may a bit obscure to most readers who's
> not
> > familiar with the KafkaConsumer internals, could you please add more
> > elaborations. More specifically, I think the root causes of the public
> APIs
> > mentioned are a bit different while the KIP's explanation sounds like
> they
> > are due to the same reason:
> >
> > 3.1 fetchCommittedOffsets(): this internal call will block forever if the
> > committed offsets cannot be fetched successfully and affect position()
> and
> > committed(). We need to break out of its internal while loop.
> > 3.2 position() itself will while loop when offsets cannot be retrieved in
> > the underlying async call. We need to break out this while loop.
> > 3.3 commitSync() passed Long.MAX_VALUE as the timeout value, we should
> take
> > the user specified timeouts when applicable.
> >
> >
> >
> > Guozhang
> >
> > On Sat, Mar 17, 2018 at 4:44 PM, Richard Yu <yohan.richard...@gmail.com>
> > wrote:
> >
> > > Actually, what I said above is inaccurate. In
> > > testSeekAndCommitWithBrokerFailures, TestUtils.waitUntilTrue blocks,
> not
> > > seek.
> > > My assumption is that seek did not update correctly. I will be digging
> > > further into this.
> > >
> > >
> > >
> > > On Sat, Mar 17, 2018 at 4:16 PM, Richard Yu <
> yohan.richard...@gmail.com>
> > > wrote:
> > >
> > > > One more thing: when looking through tests, I have realized that
> seek()
> > > > methods can potentially block indefinitely. As you well know, seek()
> is
> > > > called when pollOnce() or position() is active. Thus, if position()
> > > blocks
> > > > indefinitely, then so would seek(). Should bounding seek() also be
> > > included
> > > > in this KIP?
> > > >
> > > > Thanks, Richard
> > > >
> > > > On Sat, Mar 17, 2018 at 1:16 PM, Richard Yu <
> > yohan.richard...@gmail.com>
> > > > wrote:
> > > >
> > > >> Thanks for the advice, Jason
> > > >>
> > > >> I have modified KIP-266 to include the java doc for committed() and
> > > other
> > > >> blocking methods, and I also
> > > >> mentioned poll() which will also be bounded. Let me know if there is
> > > >> anything else. :)
> > > >>
> > > >> Sincerely, Richard
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Sat, Mar 17, 2018 at 12:00 PM, Jason Gustafson <
> ja...@confluent.io
> > >
> > > >> wrote:
> > > >>
> > > >>> Hi Richard,
> > > >>>
> > > >>> Thanks for the updates. I'm really glad you picked this up. A
> couple
> > > >>> minor
> > > >>> comments:
> > > >>>
> > > >>> 1. Can you list the full set of new APIs explicitly in the KIP?
> > > >>> Currently I
> > > >>> only see the javadoc for `position()`.
> > > >>>
> > > >>> 2. We should consider adding `TimeUnit` to the new methods to avoid
> > > unit
> > > >>> confusion. I know it's inconsistent with the poll() API, but I
> think
> > it
> > > >>> was
> > > >>> probably a mistake not to include it there, so better not to double
> > > down
> > > >>> on
> > > >>> that mistake. And note that we do already have `close(long,
> > TimeUnit)`.
> > > >>>
> > > >>> Other than that, I think the current KIP seems reasonable.
> > > >>>
> > > >>> Thanks,
> > > >>> Jason
> > > >>>
> > > >>> On Wed, Mar 14, 2018 at 5:00 PM, Richard Yu <
> > > yohan.richard...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>> > Note to all: I have included bounding commitSync() and
> committed()
> > in
> > > >>> this
> > > >>> > KIP.
> > > >>> >
> > > >>> > On Sun, Mar 11, 2018 at 5:05 PM, Richard Yu <
> > > >>> yohan.richard...@gmail.com>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > Hi all,
> > > >>> > >
> > > >>> > > I updated the KIP where overloading position() is now the
> favored
> > > >>> > approach.
> > > >>> > > Bounding position() using requestTimeoutMs has been listed as
> > > >>> rejected.
> > > >>> > >
> > > >>> > > Any thoughts?
> > > >>> > >
> > > >>> > > On Tue, Mar 6, 2018 at 6:00 PM, Guozhang Wang <
> > wangg...@gmail.com>
> > > >>> > wrote:
> > > >>> > >
> > > >>> > >> I agree that adding the overloads is most flexible. But going
> > for
> > > >>> that
> > > >>> > >> direction we'd do that for all the blocking call that I've
> > listed
> > > >>> above,
> > > >>> > >> with this timeout value covering the end-to-end waiting time.
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> Guozhang
> > > >>> > >>
> > > >>> > >> On Tue, Mar 6, 2018 at 10:02 AM, Ted Yu <yuzhih...@gmail.com>
> > > >>> wrote:
> > > >>> > >>
> > > >>> > >> > bq. The most flexible option is to add overloads to the
> > consumer
> > > >>> > >> >
> > > >>> > >> > This option is flexible.
> > > >>> > >> >
> > > >>> > >> > Looking at the tail of SPARK-18057, Spark dev voiced the
> same
> > > >>> choice.
> > > >>> > >> >
> > > >>> > >> > +1 for adding overload with timeout parameter.
> > > >>> > >> >
> > > >>> > >> > Cheers
> > > >>> > >> >
> > > >>> > >> > On Mon, Mar 5, 2018 at 2:42 PM, Jason Gustafson <
> > > >>> ja...@confluent.io>
> > > >>> > >> > wrote:
> > > >>> > >> >
> > > >>> > >> > > @Guozhang I probably have suggested all options at some
> > point
> > > or
> > > >>> > >> another,
> > > >>> > >> > > including most recently, the current KIP! I was thinking
> > that
> > > >>> > >> practically
> > > >>> > >> > > speaking, the request timeout defines how long the user is
> > > >>> willing
> > > >>> > to
> > > >>> > >> > wait
> > > >>> > >> > > for a response. The consumer doesn't really have a complex
> > > send
> > > >>> > >> process
> > > >>> > >> > > like the producer for any of these APIs, so I wasn't sure
> > how
> > > >>> much
> > > >>> > >> > benefit
> > > >>> > >> > > there would be from having more granular control over
> > timeouts
> > > >>> (in
> > > >>> > the
> > > >>> > >> > end,
> > > >>> > >> > > KIP-91 just adds a single timeout to control the whole
> > send).
> > > >>> That
> > > >>> > >> said,
> > > >>> > >> > it
> > > >>> > >> > > might indeed be better to avoid overloading the config as
> > you
> > > >>> > suggest
> > > >>> > >> > since
> > > >>> > >> > > at least it avoids inconsistency with the producer's
> usage.
> > > >>> > >> > >
> > > >>> > >> > > The most flexible option is to add overloads to the
> consumer
> > > so
> > > >>> that
> > > >>> > >> > users
> > > >>> > >> > > can pass the timeout directly. I'm not sure if that is
> more
> > or
> > > >>> less
> > > >>> > >> > > annoying than a new config, but I've found config
> timeouts a
> > > >>> little
> > > >>> > >> > > constraining in practice. For example, I could imagine
> users
> > > >>> wanting
> > > >>> > >> to
> > > >>> > >> > > wait longer for an offset commit operation than a position
> > > >>> lookup;
> > > >>> > if
> > > >>> > >> the
> > > >>> > >> > > latter isn't timely, users can just pause the partition
> and
> > > >>> continue
> > > >>> > >> > > fetching on others. If you cannot commit offsets, however,
> > it
> > > >>> might
> > > >>> > be
> > > >>> > >> > > safer for an application to wait availability of the
> > > coordinator
> > > >>> > than
> > > >>> > >> > > continuing.
> > > >>> > >> > >
> > > >>> > >> > > -Jason
> > > >>> > >> > >
> > > >>> > >> > > On Sun, Mar 4, 2018 at 10:14 PM, Guozhang Wang <
> > > >>> wangg...@gmail.com>
> > > >>> > >> > wrote:
> > > >>> > >> > >
> > > >>> > >> > > > Hello Richard,
> > > >>> > >> > > >
> > > >>> > >> > > > Thanks for the proposed KIP. I have a couple of general
> > > >>> comments:
> > > >>> > >> > > >
> > > >>> > >> > > > 1. I'm not sure if piggy-backing the timeout exception
> on
> > > the
> > > >>> > >> > > > existing requestTimeoutMs configured in "
> > request.timeout.ms
> > > "
> > > >>> is a
> > > >>> > >> good
> > > >>> > >> > > > idea
> > > >>> > >> > > > since a) it is a general config that applies for all
> types
> > > of
> > > >>> > >> requests,
> > > >>> > >> > > and
> > > >>> > >> > > > 2) using it to cover all the phases of an API call,
> > > including
> > > >>> > >> network
> > > >>> > >> > > round
> > > >>> > >> > > > trip and potential metadata refresh is shown to not be a
> > > good
> > > >>> > idea,
> > > >>> > >> as
> > > >>> > >> > > > illustrated in KIP-91:
> > > >>> > >> > > >
> > > >>> > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >>> > >> > > > 91+Provide+Intuitive+User+Timeouts+in+The+Producer
> > > >>> > >> > > >
> > > >>> > >> > > > In fact, I think in KAFKA-4879 which is aimed for the
> same
> > > >>> issue
> > > >>> > as
> > > >>> > >> > > > KAFKA-6608,
> > > >>> > >> > > > Jason has suggested we use a new config for the API.
> Maybe
> > > >>> this
> > > >>> > >> would
> > > >>> > >> > be
> > > >>> > >> > > a
> > > >>> > >> > > > more intuitive manner than reusing the
> request.timeout.ms
> > > >>> config.
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > 2. Besides the Consumer.position() call, there are a
> > couple
> > > of
> > > >>> > more
> > > >>> > >> > > > blocking calls today that could result in infinite
> > blocking:
> > > >>> > >> > > > Consumer.commitSync() and Consumer.committed(), should
> > they
> > > be
> > > >>> > >> > considered
> > > >>> > >> > > > in this KIP as well?
> > > >>> > >> > > >
> > > >>> > >> > > > 3. There are a few other APIs that are today relying on
> > > >>> > >> > > request.timeout.ms
> > > >>> > >> > > > already for breaking the infinite blocking, namely
> > > >>> > >> > > Consumer.partitionFor(),
> > > >>> > >> > > > Consumer.OffsetAndTimestamp() and Consumer.listTopics(),
> > if
> > > >>> we are
> > > >>> > >> > making
> > > >>> > >> > > > the other blocking calls to be relying a new config as
> > > >>> suggested
> > > >>> > in
> > > >>> > >> 1)
> > > >>> > >> > > > above, should we also change the semantics of these API
> > > >>> functions
> > > >>> > >> for
> > > >>> > >> > > > consistency?
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > Guozhang
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > On Sun, Mar 4, 2018 at 11:13 AM, Richard Yu <
> > > >>> > >> > yohan.richard...@gmail.com>
> > > >>> > >> > > > wrote:
> > > >>> > >> > > >
> > > >>> > >> > > > > Hi all,
> > > >>> > >> > > > >
> > > >>> > >> > > > > I would like to discuss a potential change which would
> > be
> > > >>> made
> > > >>> > to
> > > >>> > >> > > > > KafkaConsumer:
> > > >>> > >> > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > >>> > >> > > > action?pageId=75974886
> > > >>> > >> > > > >
> > > >>> > >> > > > > Thanks,
> > > >>> > >> > > > > Richard Yu
> > > >>> > >> > > > >
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > >
> > > >>> > >> > > > --
> > > >>> > >> > > > -- Guozhang
> > > >>> > >> > > >
> > > >>> > >> > >
> > > >>> > >> >
> > > >>> > >>
> > > >>> > >>
> > > >>> > >>
> > > >>> > >> --
> > > >>> > >> -- Guozhang
> > > >>> > >>
> > > >>> > >
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

Reply via email to