Sounds fair to me. I'm +1 on the KIP.

On Thu, Jun 4, 2020 at 5:26 PM Matthias J. Sax <mj...@apache.org> wrote:
> Guozhang,
>
> what you propose makes sense, but this seems to get into implementation detail territory already?
>
> Thus, if there are no further change requests to the KIP wiki page itself, I would like to proceed with the VOTE.
>
>
> -Matthias
>
> On 5/20/20 12:30 PM, Guozhang Wang wrote:
> > Thanks Matthias,
> >
> > I agree with you on all the bullet points above. Regarding the admin-client outer-loop retries inside the partition assignor, I think we should treat the error codes of those two blocking calls differently:
> >
> > Describe-topic:
> > * unknown-topic (3): add this topic to the to-be-created topic list.
> > * leader-not-available (5): do not try to create; retry in the outer loop.
> > * request-timeout: break the current loop and retry in the outer loop.
> > * others: fatal error.
> >
> > Create-topic:
> > * topic-already-exists: retry in the outer loop to validate that num.partitions matches the expectation.
> > * request-timeout: break the current loop and retry in the outer loop.
> > * others: fatal error.
> >
> > And in the outer loop, I think we can have a global timer for the whole "assign()" function, not only for the internal-topic-manager, and the timer can be hard-coded to, e.g., half of the rebalance.timeout to get rid of `retries`; if we cannot complete the assignment before the timeout runs out, we can return just a partial assignment (e.g., if there are two tasks but we can only get the topic metadata for one of them, then just do the assignment for that one) while encoding a request for another rebalance in the error-code field.
> >
> > Guozhang
> >
> > On Mon, May 18, 2020 at 7:26 PM Matthias J. Sax <mj...@apache.org> wrote:
> >
> >> No worries Guozhang, any feedback is always very welcome! My reply is going to be a little longer... Sorry.
> >>
> >>> 1) There are some inconsistent statements in the proposal regarding what to deprecate:
> >>
> >> The proposal of the KIP is to deprecate `retries` for producer, admin, and Streams. Maybe the confusion is about the dependency of those settings within Streams, and that we handle the deprecation somewhat differently for plain clients vs Streams:
> >>
> >> For the plain producer/admin client, the default `retries` is set to MAX_VALUE. The config will be deprecated but still be respected.
> >>
> >> For Streams, the default `retries` is set to zero; however, this default does _not_ affect the embedded producer/admin clients -- both stay on their own default of MAX_VALUE.
> >>
> >> Currently, this introduces the issue that if a user wants to increase Streams retries, she might by accident reduce the embedded client retries, too. To avoid this issue, she would need to set
> >>
> >> retries=100
> >> producer.retries=MAX_VALUE
> >> admin.retries=MAX_VALUE
> >>
> >> This KIP will fix this issue only in the long term though, i.e., when `retries` is finally removed. Short term, using `retries` in StreamsConfig would still affect the embedded clients, but Streams as well as both clients would log a WARN message. This preserves backward compatibility.
> >>
> >> Within Streams, `retries` is ignored and the new `task.timeout.ms` is used instead. This increases the default resilience of Kafka Streams itself. We could also achieve this by still respecting `retries` and changing its default value. However, because we deprecate `retries`, it seems better to just ignore it and switch to the new config directly.
> >>
> >> I updated the KIP with some more details.
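> >>
> >> As a side note, here is the workaround from above spelled out in code
> >> (just an illustrative sketch, not something from the KIP):
> >>
> >>     import java.util.Properties;
> >>     import org.apache.kafka.clients.admin.AdminClientConfig;
> >>     import org.apache.kafka.clients.producer.ProducerConfig;
> >>     import org.apache.kafka.streams.StreamsConfig;
> >>
> >>     final class RetriesWorkaroundSketch {
> >>         static Properties config() {
> >>             final Properties props = new Properties();
> >>             // bump the Streams-level retries ...
> >>             props.put(StreamsConfig.RETRIES_CONFIG, 100);
> >>             // ... and pin the embedded clients back to their own
> >>             // defaults, so the Streams-level setting does not leak
> >>             // into them by accident
> >>             props.put(
> >>                 StreamsConfig.producerPrefix(ProducerConfig.RETRIES_CONFIG),
> >>                 Integer.MAX_VALUE);
> >>             props.put(
> >>                 StreamsConfig.adminClientPrefix(AdminClientConfig.RETRIES_CONFIG),
> >>                 Integer.MAX_VALUE);
> >>             return props;
> >>         }
> >>     }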
> >>
> >>> 2) We should also document the related behavior change in PartitionAssignor that uses AdminClient.
> >>
> >> This is actually a good point. Originally, I looked into this only briefly, but it raised an interesting question about how to handle it.
> >>
> >> Note that `TimeoutException`s are currently not handled in this retry loop. Also note that the default retries value for other errors would be MAX_VALUE (inherited from `AdminClient#retries`, as already mentioned by Guozhang).
> >>
> >> Applying the new `task.timeout.ms` config does not seem to be appropriate, because the AdminClient is used during a rebalance in the leader. We could introduce a new config just for this case, but it seems to be a little bit too much. Furthermore, the group coordinator applies a timeout on the leader anyway: if the assignment is not computed within the timeout, the leader is removed from the group and another rebalance is triggered.
> >>
> >> Overall, we make multiple admin client calls, and thus we should keep some retry logic and not just rely on the admin client's internal retries, as those would fall short when different calls need to be retried in an interleaved fashion. We could just retry infinitely and rely on the group coordinator to remove the leader eventually. However, this does not seem to be ideal, because the removed leader might be stuck forever.
> >>
> >> The question though is: if topic metadata cannot be obtained or internal topics cannot be created, what should we do? We can't compute an assignment anyway. We already have a rebalance error code to shut down all instances for this case. Maybe we could break the retry loop before the leader is kicked out of the group and send this error code? This way we don't need a new config, but piggy-back on the existing timeout to compute the assignment. To be conservative, we could use a 50% threshold?
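> >>
> >> To make the discussion more concrete, the outer loop I have in mind
> >> would look roughly like this, following Guozhang's error
> >> classification from above (a sketch only -- all names are made up
> >> and this is not the actual InternalTopicManager code; real code
> >> would also check the per-topic futures instead of `all()` and
> >> validate num.partitions):
> >>
> >>     import java.util.Collections;
> >>     import java.util.HashSet;
> >>     import java.util.List;
> >>     import java.util.Set;
> >>     import java.util.concurrent.ExecutionException;
> >>     import org.apache.kafka.clients.admin.Admin;
> >>     import org.apache.kafka.clients.admin.NewTopic;
> >>     import org.apache.kafka.common.errors.LeaderNotAvailableException;
> >>     import org.apache.kafka.common.errors.TimeoutException;
> >>     import org.apache.kafka.common.errors.TopicExistsException;
> >>     import org.apache.kafka.common.errors.UnknownTopicOrPartitionException;
> >>
> >>     final class InternalTopicSetupSketch {
> >>         // Returns true if all internal topics are ready before the
> >>         // deadline (e.g., 50% of the rebalance timeout); false means
> >>         // the caller should hand out a partial assignment plus an
> >>         // error code requesting a follow-up rebalance.
> >>         static boolean setup(final Admin admin,
> >>                              final List<NewTopic> topics,
> >>                              final long deadlineMs) throws InterruptedException {
> >>             while (System.currentTimeMillis() < deadlineMs) {
> >>                 final Set<NewTopic> toCreate = new HashSet<>();
> >>                 boolean mustRetry = false;
> >>                 for (final NewTopic topic : topics) {
> >>                     try {
> >>                         admin.describeTopics(
> >>                             Collections.singleton(topic.name())).all().get();
> >>                     } catch (final ExecutionException error) {
> >>                         final Throwable cause = error.getCause();
> >>                         if (cause instanceof UnknownTopicOrPartitionException) {
> >>                             toCreate.add(topic); // schedule for creation
> >>                         } else if (cause instanceof LeaderNotAvailableException
> >>                                 || cause instanceof TimeoutException) {
> >>                             mustRetry = true; // retry in the outer loop
> >>                         } else {
> >>                             throw new IllegalStateException(cause); // fatal
> >>                         }
> >>                     }
> >>                 }
> >>                 if (toCreate.isEmpty() && !mustRetry) {
> >>                     return true; // all internal topics exist
> >>                 }
> >>                 if (!toCreate.isEmpty()) {
> >>                     try {
> >>                         admin.createTopics(toCreate).all().get();
> >>                     } catch (final ExecutionException error) {
> >>                         final Throwable cause = error.getCause();
> >>                         if (!(cause instanceof TopicExistsException)
> >>                                 && !(cause instanceof TimeoutException)) {
> >>                             throw new IllegalStateException(cause); // fatal
> >>                         }
> >>                         // topic-already-exists or timeout: re-validate
> >>                         // in the next iteration of the outer loop
> >>                     }
> >>                 }
> >>                 Thread.sleep(100); // small back-off between attempts
> >>             }
> >>             return false; // deadline exceeded
> >>         }
> >>     }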
> >>
> >>> BTW as I mentioned in the previous statement, today throwing an exception that kills one thread but not the whole instance is still an issue for monitoring purposes, but I suppose this is not going to be in this KIP but addressed by another KIP, right?
> >>
> >> Correct. This issue is out of scope.
> >>
> >>> for consumer, if it gets a TimeoutException polling records, should we start timing all tasks since that single consumer would affect all tasks?
> >>
> >> Consumer#poll() would never throw a `TimeoutException` and thus `task.timeout.ms` does not apply.
> >>
> >>> For other blocking calls like `endOffsets()` etc, they are usually also issued on behalf of a batch of tasks, so if one of those gets a timeout exception should we start ticking all the corresponding tasks as well? Maybe worth clarifying a bit more in the wiki.
> >>
> >> Good point. I agree that the timer should tick for all affected tasks. I clarified this in the KIP.
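> >>
> >> To sketch the bookkeeping I have in mind (names are made up; this is
> >> not code from the KIP): a timeout exception from a shared client
> >> call starts the timer for every task the call was issued for, any
> >> successful call resets it, and `task.timeout.ms = 0` fails on the
> >> first timeout, i.e., no retries.
> >>
> >>     import java.util.Collection;
> >>     import java.util.HashMap;
> >>     import java.util.Map;
> >>     import org.apache.kafka.common.errors.TimeoutException;
> >>
> >>     final class TaskTimeoutsSketch {
> >>         private final long taskTimeoutMs; // task.timeout.ms
> >>         private final Map<String, Long> firstTimeoutMs = new HashMap<>();
> >>
> >>         TaskTimeoutsSketch(final long taskTimeoutMs) {
> >>             this.taskTimeoutMs = taskTimeoutMs;
> >>         }
> >>
> >>         // called when a client-level TimeoutException hits a call
> >>         // that was issued on behalf of a batch of tasks
> >>         void onClientTimeout(final Collection<String> affectedTaskIds,
> >>                              final long nowMs) {
> >>             for (final String taskId : affectedTaskIds) {
> >>                 final long start =
> >>                     firstTimeoutMs.computeIfAbsent(taskId, id -> nowMs);
> >>                 if (nowMs - start >= taskTimeoutMs) {
> >>                     throw new TimeoutException("Task " + taskId
> >>                         + " made no progress within task.timeout.ms");
> >>                 }
> >>             }
> >>         }
> >>
> >>         // any successful call on behalf of a task resets its timer
> >>         void onProgress(final String taskId) {
> >>             firstTimeoutMs.remove(taskId);
> >>         }
> >>     }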
> >>
> >> About KAFKA-6520:
> >>
> >> There is already KIP-457, and I am not sure this KIP should subsume it. It somewhat feels orthogonal. I am also not 100% sure if KIP-572 actually helps much, because a thread could be disconnected from the brokers without any timeout exception being thrown: if all tasks are RUNNING and just polling for new data, but no progress is made because of a network issue, `poll()` would just return no data but not throw.
> >>
> >> @Bruno
> >>
> >>> Wouldn't it be better to specify task.timeout.ms to -1 if no retry should be done
> >>
> >> Interesting idea. Personally, I find `-1` confusing. And it seems intuitive to me that `0` implies no retries (this seems to be in alignment with other APIs).
> >>
> >> -Matthias
> >>
> >> On 5/18/20 9:53 AM, Guozhang Wang wrote:
> >>> Hi Matthias,
> >>>
> >>> Sorry for flooding the thread, but with this KIP I feel the design scope of https://issues.apache.org/jira/browse/KAFKA-6520 can be simplified a lot, and maybe the design can just be piggy-backed as part of this KIP, wdyt?
> >>>
> >>> Guozhang
> >>>
> >>> On Mon, May 18, 2020 at 9:47 AM Guozhang Wang <wangg...@gmail.com> wrote:
> >>>
> >>>> Hi Matthias,
> >>>>
> >>>> Just to add one more meta comment: for the consumer, if it gets a TimeoutException polling records, should we start timing all tasks, since that single consumer would affect all tasks? For other blocking calls like `endOffsets()` etc, they are usually also issued on behalf of a batch of tasks, so if one of those gets a timeout exception, should we start ticking all the corresponding tasks as well? Maybe worth clarifying a bit more in the wiki.
> >>>>
> >>>> Guozhang
> >>>>
> >>>> On Mon, May 18, 2020 at 12:26 AM Bruno Cadonna <br...@confluent.io> wrote:
> >>>>
> >>>>> Hi Matthias,
> >>>>>
> >>>>> I am +1 (non-binding) on the KIP.
> >>>>>
> >>>>> Just one final remark: Wouldn't it be better to specify task.timeout.ms to -1 if no retry should be done? IMO it would make the config more intuitive, because 0 would not have two possible meanings (i.e., try once and never try) anymore.
> >>>>>
> >>>>> Best,
> >>>>> Bruno
> >>>>>
> >>>>> On Sat, May 16, 2020 at 7:51 PM Guozhang Wang <wangg...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hello Matthias,
> >>>>>>
> >>>>>> Thanks for the updated KIP; overall I'm +1 on this proposal. Some minor comments (I know gmail mixed that again for me, so I'm leaving it as a combo for both DISCUSS and VOTE :)
> >>>>>>
> >>>>>> 1) There are some inconsistent statements in the proposal regarding what to deprecate: at the beginning it says "We propose to deprecate the retries configuration parameter for the producer and admin client" but later in compatibility we say "Producer and admin client behavior does not change; both still respect retries config." My understanding is that we will only deprecate StreamsConfig#retries while not touching ProducerConfig/AdminClientConfig#retries, AND we will always override the embedded producer / admin retries config to infinity, so that we never rely on those configs but are always bounded by the timeout config. Is that right? If yes, could you clarify this in the doc?
> >>>>>>
> >>>>>> 2) We should also document the related behavior change in PartitionAssignor that uses AdminClient. More specifically, the admin client's retries config is piggy-backed inside InternalTopicManager as an outer-loop retry logic in addition to AdminClient's own inner retry loop. There are some existing issues like KAFKA-9999 / KAFKA-10006 that Sophie and Boyang have been working on. I exchanged some ideas with them, and generally we should consider whether the outer loop of InternalTopicManager should just be removed and whether, on a TimeoutException, we should just trigger another rebalance, etc.
> >>>>>>
> >>>>>> BTW as I mentioned in the previous statement, today throwing an exception that kills one thread but not the whole instance is still an issue for monitoring purposes, but I suppose this is not going to be in this KIP but addressed by another KIP, right?
> >>>>>>
> >>>>>> Guozhang
> >>>>>>
> >>>>>> On Fri, May 15, 2020 at 1:14 PM Boyang Chen <reluctanthero...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Good, good.
> >>>>>>>
> >>>>>>> Read through the discussions; the KIP looks good to me, +1 (non-binding)
> >>>>>>>
> >>>>>>> On Fri, May 15, 2020 at 11:51 AM Sophie Blee-Goldman <sop...@confluent.io> wrote:
> >>>>>>>
> >>>>>>>> Called out!
> >>>>>>>>
> >>>>>>>> Seems like gmail struggles with [...] prefixed subjects. You'd think they would adapt all their practices to conform to the Apache Kafka mailing list standards, but no!
> >>>>>>>>
> >>>>>>>> +1 (non-binding) by the way
> >>>>>>>>
> >>>>>>>> On Fri, May 15, 2020 at 11:46 AM John Roesler <vvcep...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Boyang,
> >>>>>>>>>
> >>>>>>>>> It is a separate thread, and you have just revealed yourself as a gmail user ;)
> >>>>>>>>>
> >>>>>>>>> (Gmail sometimes conflates vote and discuss threads for no apparent reason)
> >>>>>>>>>
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Fri, May 15, 2020, at 13:39, Boyang Chen wrote:
> >>>>>>>>>> Hey Matthias, should this be on a separate VOTE thread?
> >>>>>>>>>>
> >>>>>>>>>> On Fri, May 15, 2020 at 11:38 AM John Roesler <vvcep...@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks, Matthias!
> >>>>>>>>>>>
> >>>>>>>>>>> I'm +1 (binding)
> >>>>>>>>>>>
> >>>>>>>>>>> -John
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, May 15, 2020, at 11:55, Matthias J. Sax wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to start the vote on KIP-572:
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-572%3A+Improve+timeouts+and+retries+in+Kafka+Streams
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Matthias
> >>>>>>
> >>>>>> --
> >>>>>> -- Guozhang
> >>>>
> >>>> --
> >>>> -- Guozhang

--
-- Guozhang