Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

Brian Byrne Wed, 20 Nov 2019 14:21:31 -0800

Hello all,

I've refactored the KIP to drop the proposal for asynchronous metadata
fetching in the producer during send(). It now focuses exclusively on
reducing the topic metadata fetch payload and proposes a new
configuration flag to control topic expiry behavior. Please take a look
when possible.


https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics

Thanks,
Brian

On Fri, Oct 4, 2019 at 10:04 AM Brian Byrne <bby...@confluent.io> wrote:

> Lucas, Guozhang,
>
> Thank you for the comments. Good point on METADATA_MAX_AGE_CONFIG - it
> looks like the ProducerMetadata was differentiating between expiry and
> refresh, but it should be unnecessary to do so once the cost of fetching a
> single topic's metadata is reduced.
>
> I've updated the rejected alternatives and removed the config variables.
>
> Brian
>
> On Fri, Oct 4, 2019 at 9:20 AM Guozhang Wang <wangg...@gmail.com> wrote:
>
>> Hello Brian,
>>
>> Thanks for the KIP.
>>
>> I think an asynchronous metadata update addresses 1) metadata updates
>> blocking send. For the other issues, the producer already has a
>> configurable `METADATA_MAX_AGE_CONFIG`, similar to the consumer's, which
>> defaults to 5 minutes. So maybe we do not need to introduce new configs
>> here, and can instead change the semantics of that config from global
>> expiry (today we just enforce a full metadata update for the whole
>> cluster) to single-topic expiry. We can also extend a topic's expiry
>> deadline whenever its metadata is successfully used to send a produce
>> request.
>>
>>
>> Guozhang
>>
>>
>>
>> On Thu, Oct 3, 2019 at 6:51 PM Lucas Bradstreet <lu...@confluent.io>
>> wrote:
>>
>> > Hi Brian,
>> >
>> > This looks great, and should help reduce blocking and high metadata
>> > request volumes when the producer is sending to large numbers of topics,
>> > especially at low volumes. I think the approach to make metadata fetching
>> > asynchronous and batch metadata requests together will help significantly.
>> >
>> > The only other approach I can think of is to allow users to supply the
>> > producer with the expected topics upfront, allowing the producer to
>> > perform a single initial metadata request before any sends occur. I see
>> > no real advantages to this approach compared to the async method you’ve
>> > proposed, but maybe we could add it to the rejected alternatives section?
>> >
>> > Thanks,
>> >
>> > Lucas
>> >
>> > On Fri, 20 Sep 2019 at 11:46, Brian Byrne <bby...@confluent.io> wrote:
>> >
>> > > I've updated the 'Proposed Changes' to include two new producer
>> > > configuration variables: topic.expiry.ms and topic.refresh.ms. Please
>> > > take a look.
>> > >
>> > > Thanks,
>> > > Brian
>> > >
>> > > On Tue, Sep 17, 2019 at 12:59 PM Brian Byrne <bby...@confluent.io> wrote:
>> > >
>> > > > Dev team,
>> > > >
>> > > > Requesting discussion for improvement to the producer when dealing
>> > > > with a large number of topics.
>> > > >
>> > > > KIP:
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
>> > > >
>> > > > JIRA: https://issues.apache.org/jira/browse/KAFKA-8904
>> > > >
>> > > > Thoughts and feedback would be appreciated.
>> > > >
>> > > > Thanks,
>> > > > Brian
>> > > >
>> > >
>> >
>>
>>
>> --
>> -- Guozhang
>>
>
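
As a rough illustration of the single-topic expiry semantics Guozhang describes above (each topic carries its own deadline, and that deadline is extended whenever the topic's metadata is successfully used for a produce request), here is a minimal Java sketch. The class `TopicExpiryTracker` and its methods are hypothetical names invented for this example; this is not the Kafka producer's actual implementation. Time is passed in explicitly to keep the example deterministic, and the 5-minute value simply mirrors the default mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Kafka's ProducerMetadata.
final class TopicExpiryTracker {
    private final long expiryMs;
    // Per-topic expiry deadline, in milliseconds.
    private final Map<String, Long> deadlineMs = new HashMap<>();

    TopicExpiryTracker(long expiryMs) {
        this.expiryMs = expiryMs;
    }

    // Called when a topic's metadata was successfully used for a send:
    // starts tracking the topic or pushes its deadline further out.
    void recordUse(String topic, long nowMs) {
        deadlineMs.put(topic, nowMs + expiryMs);
    }

    boolean isTracked(String topic) {
        return deadlineMs.containsKey(topic);
    }

    // Drops every topic whose deadline has passed and returns the dropped names.
    List<String> expire(long nowMs) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlineMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long fiveMinutesMs = 5 * 60 * 1000L;
        TopicExpiryTracker tracker = new TopicExpiryTracker(fiveMinutesMs);
        tracker.recordUse("orders", 0L);
        tracker.recordUse("audit", 0L);
        // "orders" is used again, so only its deadline moves out.
        tracker.recordUse("orders", 200_000L);
        // At t = 300000 ms only "audit" has reached its deadline.
        System.out.println(tracker.expire(300_000L));
        System.out.println(tracker.isTracked("orders"));
    }
}
```

With a scheme like this, a topic that is written to regularly never ages out, while an idle topic is dropped on its own schedule without forcing a refresh of every other topic.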
