Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-02 Thread Guozhang Wang
Cool. Sounds good to me!
On Thu, Jan 2, 2020 at 2:21 PM Brian Byrne wrote:
> Hi Guozhang,
>
> Only evicting due to memory pressure (or if the metadata is stale, to
> prevent infinitely refreshing no longer used topics when memory's not an
> issue) makes sense, and would be the most intuitive

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-02 Thread Brian Byrne
Hi Guozhang, Only evicting due to memory pressure (or if the metadata is stale, to prevent infinitely refreshing no longer used topics when memory's not an issue) makes sense, and would be the most intuitive way of going about it. From an implementation perspective, it's a little more difficult

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-02 Thread Guozhang Wang
On Thu, Jan 2, 2020 at 12:42 PM Brian Byrne wrote:
> Hi Guozhang,
>
> You're correct in that, with the current defined default values, it's
> likely we'll be more proactive in refreshing metadata for
> once/rarely-touched topics, which isn't ideal. It'd be at most 2x the rate,
> but we can

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-02 Thread Brian Byrne
Hi Guozhang, You're correct in that, with the current defined default values, it's likely we'll be more proactive in refreshing metadata for once/rarely-touched topics, which isn't ideal. It'd be at most 2x the rate, but we can certainly do better. I made the default eviction rate to mirror the

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-02 Thread Guozhang Wang
Hello Brian, For the newly added `metadata.evict.ms` config, since its default value is set to the same as `metadata.max.age.ms` and most users would not override default config values, is it possible that, in scenarios where the producer sends to a specific topic once and never sends again, we may end
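For readers following along, here is a minimal sketch of how the two settings mentioned above would sit side by side in a producer configuration. `metadata.max.age.ms` is an existing producer setting (5 minutes by default, as noted elsewhere in this thread); `metadata.evict.ms` is only the name proposed in this discussion, so that key and the doubled value chosen for it are assumptions for illustration, not a confirmed API. The class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: builds producer properties with plain string keys,
// so the sketch compiles without any Kafka dependency.
class MetadataConfigSketch {

    static Properties producerProps() {
        Properties props = new Properties();
        // Placeholder broker address, not taken from the thread.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Existing producer setting: refresh metadata at least every 5 minutes.
        props.setProperty("metadata.max.age.ms", "300000");
        // Proposed in this thread only (assumption): explicit override so the
        // eviction interval no longer mirrors the refresh interval.
        props.setProperty("metadata.evict.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps();
        System.out.println("metadata.max.age.ms = " + props.getProperty("metadata.max.age.ms"));
        System.out.println("metadata.evict.ms   = " + props.getProperty("metadata.evict.ms"));
    }
}
```

A user who leaves both keys unset would get identical values under the defaults described in the message above, which is the case the question is asking about.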

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-26 Thread Brian Byrne
Hi Stanislav, Appreciate the feedback! 1. You're correct. I've added notes to the KIP to clarify. 2. Yes it should. Fixed. 3. So I made a mistake when generalizing the target refresh size, which should have been using `metadata.max.age.ms` instead of `metadata.evict.ms`. Therefore,

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-26 Thread Stanislav Kozlovski
Hey Brian, 1. Could we more explicitly clarify the behavior of the algorithm when `|T| > TARGET_METADATA_FETCH_SIZE`? I assume we ignore the config in that scenario. 2. Should `targetMetadataFetchSize = Math.max(topicsPerSec / 10, 20)` be `topicsPerSec * 10`? 3. When is this new algorithm
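To make question 2 above concrete, the sketch below compares the two expressions for a few sample rates. It assumes `topicsPerSec` is an integer rate and that the suggested fix keeps the floor of 20 from the quoted formula; neither detail is spelled out in the message, and the class and method names are hypothetical.

```java
// Hypothetical comparison of the two target-fetch-size formulas from the thread.
class TargetFetchSizeSketch {

    // Formula as quoted from the KIP text: divides the rate by 10.
    static int asQuoted(int topicsPerSec) {
        return Math.max(topicsPerSec / 10, 20);
    }

    // Suggested correction: multiplies the rate by 10.
    // Keeping the floor of 20 is an assumption made for this sketch.
    static int asSuggested(int topicsPerSec) {
        return Math.max(topicsPerSec * 10, 20);
    }

    public static void main(String[] args) {
        for (int rate : new int[] {1, 50, 500}) {
            System.out.printf("topicsPerSec=%d quoted=%d suggested=%d%n",
                    rate, asQuoted(rate), asSuggested(rate));
        }
    }
}
```

With the division, the result stays pinned at the floor of 20 until the rate exceeds 200 topics per second, whereas the multiplied form scales with the rate almost immediately.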

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-19 Thread Brian Byrne
Hello everyone,
For all interested, please take a look at the proposed algorithm as I'd like to get more feedback. I'll call for a vote once the break is over.
Thanks, Brian
On Mon, Dec 9, 2019 at 10:18 PM Guozhang Wang wrote:
> Sounds good, I agree that should not make a big difference in

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-09 Thread Guozhang Wang
Sounds good, I agree that should not make a big difference in practice.
On Mon, Dec 9, 2019 at 2:07 PM Brian Byrne wrote:
> Hi Guozhang,
>
> I see, we agree on the topic threshold not applying to urgent topics, but
> differ slightly on what should be considered urgent. I would argue that we
>

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-09 Thread Brian Byrne
Hi Guozhang, I see, we agree on the topic threshold not applying to urgent topics, but differ slightly on what should be considered urgent. I would argue that we should consider topics nearing the metadata.max.age.ms to be urgent since they may still be well within the metadata.expiry.ms. That

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-09 Thread Guozhang Wang
Hello Brian, Thanks for your explanation, could you then update the wiki page for the algorithm part since when I read it, I thought it was different from the above, e.g. urgent topics should not be added just because of max.age expiration, but should only be added if there are sending data

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-09 Thread Brian Byrne
Hi Guozhang, Thanks for the feedback!
On Sun, Dec 8, 2019 at 6:25 PM Guozhang Wang wrote:
> 1. The addition of *metadata.expiry.ms* should
> be included in the public interface. Also its semantics need more
> clarification (since previously it is hard-coded

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-12-08 Thread Guozhang Wang
Hello Brian, Thanks for the updated PR and sorry for the late reply. I reviewed the page again and here are some more comments: Minor: 1. The addition of *metadata.expiry.ms* should be included in the public interface. Also its semantics need more clarification

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-11-20 Thread deng ziming
I think it's ok, and you can add another issue about `asynchronous metadata` if `topic expiry` is not enough.
On Thu, Nov 21, 2019 at 6:20 AM Brian Byrne wrote:
> Hello all,
>
> I've refactored the KIP to remove implementing asynchronous metadata
> fetching in the producer during send(). It's

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-11-20 Thread Brian Byrne
Hello all, I've refactored the KIP to remove implementing asynchronous metadata fetching in the producer during send(). It's now exclusively focused on reducing the topic metadata fetch payload and proposes adding a new configuration flag to control topic expiry behavior. Please take a look when

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-10-04 Thread Brian Byrne
Lucas, Guozhang, Thank you for the comments. Good point on METADATA_MAX_AGE_CONFIG - it looks like the ProducerMetadata was differentiating between expiry and refresh, but it should be unnecessary to do so once the cost of fetching a single topic's metadata is reduced. I've updated the rejected

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-10-04 Thread Guozhang Wang
Hello Brian, Thanks for the KIP. I think using asynchronous metadata updates addresses 1) metadata updates blocking send, but for the other issues, the producer already has a configurable `METADATA_MAX_AGE_CONFIG`, similar to the consumer, which defaults to 5 min. So maybe we do not need to introduce

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-10-03 Thread Lucas Bradstreet
Hi Brian, This looks great, and should help reduce blocking and high metadata request volumes when the producer is sending to large numbers of topics, especially at low volumes. I think the approach to make metadata fetching asynchronous and batch metadata requests together will help

Re: [DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-09-20 Thread Brian Byrne
I've updated the 'Proposed Changes' to include two new producer configuration variables: topic.expiry.ms and topic.refresh.ms. Please take a look.
Thanks, Brian
On Tue, Sep 17, 2019 at 12:59 PM Brian Byrne wrote:
> Dev team,
>
> Requesting discussion for improvement to the producer when

[DISCUSS] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2019-09-17 Thread Brian Byrne
Dev team, Requesting discussion for improvement to the producer when dealing with a large number of topics. KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics JIRA: https://issues.apache.org/jira/browse/KAFKA-8904