Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-07 Thread Justine Olshan
Hi Omnia, Thanks for the detailed response. I agree that the client ID solution can be tricky (and could even run into the same problem if the client ID is not unique). As for the waiting one day -- that was not meant to be an exact value, but my point was that there will be some time where

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-07 Thread Omnia Ibrahim
Hi Justine, thanks for the feedback. > So consider a case where there is a storm for a given principal. We could > have a large mass of short lived producers in addition to some > "well-behaved" ones. My understanding is that if the "well-behaved" one > doesn't produce as frequently ie less than

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-07 Thread Omnia Ibrahim
Hi Igor, thanks for the feedback and sorry for the late response. > 10 Given the goal is to prevent OOMs, do we also need to > limit the number of KafkaPrincipals in use? None of the Kafka quotas has ever limited the number of KafkaPrincipals, and I don’t really think this is the issue as you just need

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-06 Thread Justine Olshan
Hi Claude, I can clarify my comments. Just to clarify -- my understanding is that we don't intend to throttle any new producer IDs at the beginning. I believe this amount is specified by `producer_ids_rate`, but you can see this as a number of producer IDs per hour. So consider a case where

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-05 Thread Claude Warren
Justine, I am new here so please excuse the ignorance. When you talk about "seen" producers I assume you mean the PIDs that the Bloom filter has seen. When you say "producer produces every 2 hours" do you mean the producer writes to a topic every 2 hours and uses the same PID? When you say "hitting

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-03 Thread Justine Olshan
Hey folks, I shared this with Omnia offline: One concern I have is with the length of time we keep "seen" producer IDs. It seems like the default is 1 hour. If a producer produces every 2 hours or so, and we are hitting the limit, it seems like we will throttle it even though we've seen it before

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-02 Thread Claude Warren, Jr
There is some question about whether or not we need the configuration options. My take on them is as follows: producer.id.quota.window.num No opinion. I don't know what this is used for, but I suspect that there is a good reason to have it. It is not used within the Bloom filter caching

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-02 Thread Claude Warren, Jr
Quick note: I renamed the example code. It is now at https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManagerCache.java On Thu, May 2, 2024 at 10:47 AM Claude Warren, Jr wrote: > Igor, thanks for taking the time to

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-02 Thread Claude Warren, Jr
Igor, thanks for taking the time to look and to review the code. I regret that I have not pushed the latest code, but I will do so and will see what I can do about answering your Bloom filter related questions here. How would an operator know or decide to change the configuration > for the

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-05-01 Thread Igor Soarez
Hi Omnia, Hi Claude, Thanks for putting this KIP together. This is an important unresolved issue in Kafka, which I have witnessed several times in production. Please see my questions below: 10 Given the goal is to prevent OOMs, do we also need to limit the number of KafkaPrincipals in use? 11.

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-30 Thread Omnia Ibrahim
Hi, just bringing some offline discussion and recent updates to the KIP to the mailing list. Claude updated the KIP to use LayeredBloomFilter from Apache Commons https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bloomfilter/LayeredBloomFilter.html

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-24 Thread Omnia Ibrahim
Hi Claude, sorry that it took me a while to respond. I finally had time to look into your implementation here https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManager.java#L121 and so far it makes sense. > So an early PID

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-16 Thread Claude Warren
The difference between p.i.q.window.count and p.i.q.window.num: To be honest, I may have misunderstood your definition of window num. But here is what I have in mind: 1. p.i.q.window.size.seconds the length of time that a window will exist. This is also the maximum time between PID uses

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-16 Thread Claude Warren
Let's put aside the CPC datasketch idea and just discuss the Bloom filter approach. I think the problem with the way the KIP is worded is that PIDs are only added if they are not seen in either of the Bloom filters. So an early PID is added to the first filter and the associated metric is

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-15 Thread Omnia Ibrahim
Hi Claude, Thanks for the implementation of the LayeredBloomFilter in apache commons. > Define a new configuration option "producer.id.quota.window.count" as > the number of windows active in window.size.seconds. What is the difference between “producer.id.quota.window.count” and

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-15 Thread Claude Warren
After thinking about this KIP over the weekend I think that there is another, lighter-weight approach. I think the question is not whether or not we have seen a given PID before but rather how many unique PIDs did the principal create in the last hour. Perhaps more exactly it is: did the Principal

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-12 Thread Claude Warren
I think there is an issue in the KIP. Basically the KIP says: if the PID is found in either of the Bloom filters then no action is taken. If the PID is not found then it is added and the quota rating metrics are incremented. In this case long running PIDs will be counted multiple times. Let's

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-12 Thread Claude Warren
Initial code is available at https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManager.java On Tue, Apr 9, 2024 at 2:37 PM Claude Warren wrote: > I should also note that the probability of false positives does not fall >

Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-09 Thread Claude Warren
I should also note that the probability of false positives does not fall below shape.P because as it approaches shape.P a new layer is created and filters are added to that. So no layer in the LayeredBloomFilter exceeds shape.P thus the entire filter does not exceed shape.P. Claude On Tue, Apr

[DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-09 Thread Claude Warren
The overall design for KIP-936 seems sound to me. I would make the following changes: Replace the "TimedBloomFilter" with a "LayeredBloomFilter" from commons-collections v4.5 Define the producer.id.quota.window.size.seconds to be the length of time that a Bloom filter of PIDs will exist. Define

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-11-02 Thread Claude Warren
I don't know why I missed this message. You don't have to update the max entries for the shape. Set the max entries to be the highest quota. Then you can use the BloomFilter.estimateN() method to determine how many PIDs have been inserted into the filter. On Wed, Aug 30, 2023 at 1:19 PM Omnia
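The suggestion above relies on estimating how many items a Bloom filter holds from how many of its bits are set. A minimal sketch of the textbook estimate, n ≈ -(m / k) · ln(1 - x / m), where m is the number of bits, k the number of hash functions and x the number of set bits, is given here. The class name `BloomCardinalityEstimate` and its parameters are hypothetical and chosen for illustration; this is not the commons-collections implementation of `estimateN()`, only the standard formula such estimators are usually based on.

```java
// Hypothetical helper for illustration; not part of Kafka or commons-collections.
final class BloomCardinalityEstimate {

    private BloomCardinalityEstimate() { }

    /**
     * Textbook cardinality estimate for a Bloom filter:
     * n ~= -(m / k) * ln(1 - x / m)
     *
     * @param numberOfBits          m, total bits in the filter
     * @param numberOfHashFunctions k, hash functions applied per item
     * @param bitsSet               x, bits currently set to 1
     */
    public static double estimateN(int numberOfBits, int numberOfHashFunctions, int bitsSet) {
        if (numberOfBits <= 0 || numberOfHashFunctions <= 0 || bitsSet < 0) {
            throw new IllegalArgumentException("m and k must be positive, x must be non-negative");
        }
        if (bitsSet >= numberOfBits) {
            // A saturated filter carries no information about how many items were added.
            return Double.POSITIVE_INFINITY;
        }
        double fractionSet = (double) bitsSet / numberOfBits;
        // log1p(-f) computes ln(1 - f) accurately for small f.
        return -((double) numberOfBits / numberOfHashFunctions) * Math.log1p(-fractionSet);
    }

    public static void main(String[] args) {
        // Half of 1000 bits set with a single hash function.
        System.out.println(estimateN(1000, 1, 500));
        // Same bit pattern, but each item sets up to two bits.
        System.out.println(estimateN(1000, 2, 500));
    }
}
```

With the filter sized for the highest quota, a per-principal check could compare this estimate against that principal's own limit, which is the idea the message describes.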

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-08-30 Thread Omnia Ibrahim
Hi Claude, sorry for the late reply, I was out for some time. Thanks for your response. > - To ensure that all produced ids are tracked for 1 hour regardless of > whether they were produced by userA or userB. Not really, we need to track producer ids created by userA separately from producer ids

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-08-21 Thread Claude Warren
I misspoke before: the LayeredBloomFilterTest.testExpiration() uses milliseconds to expire the data, but it lays out an example of how to expire filters in time intervals. On Fri, Aug 18, 2023 at 4:01 PM Claude Warren wrote: > Sorry for taking so long to get back to you, somehow I missed your

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-08-18 Thread Claude Warren
Sorry for taking so long to get back to you, somehow I missed your message. I am not sure how this will work when we have different producer-id-rate > for different KafkaPrincipal as proposed in the KIP. > For example `userA` had producer-id-rate of 1000 per hour while `user2` has > a quota of

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-07-16 Thread Omnia Ibrahim
Thanks Claude for the feedback and for raising this implementation with Apache commons-collections. I had a look into your layered bloom filter and at first glance, I think it would be a better improvement, however, regarding the following suggestion > By hashing the principal and PID into the

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-21 Thread Claude Warren
I think that either using a Stable bloom filter or a Layered bloom filter constructed as follows: - Each layer is configured for the maximum number of principal-PID pairs expected in a single minute. - Expect 60 layers (one for each minute) - If the layer becomes fully populated
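The layered construction described above can be sketched roughly as follows. This is a simplified illustration, not the commons-collections LayeredBloomFilter and not the code proposed for the KIP: the class `MinuteLayeredFilter`, its parameters, and the hashing scheme are all hypothetical. It also assumes that a new layer is started when the newest layer is full or older than the layer span, since the message is cut off before saying what happens in that case.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

// Hypothetical sketch: time-layered Bloom filter over (principal, PID) pairs.
final class MinuteLayeredFilter {

    private static final class Layer {
        final BitSet bits;
        final long createdAtMs;
        int entries;

        Layer(int numBits, long createdAtMs) {
            this.bits = new BitSet(numBits);
            this.createdAtMs = createdAtMs;
        }
    }

    private final Deque<Layer> layers = new ArrayDeque<>();
    private final int bitsPerLayer;
    private final int hashCount;
    private final int maxEntriesPerLayer;
    private final long layerSpanMs;   // e.g. one minute per layer
    private final long retentionMs;   // e.g. one hour, giving roughly 60 layers

    MinuteLayeredFilter(int bitsPerLayer, int hashCount, int maxEntriesPerLayer,
                        long layerSpanMs, long retentionMs) {
        this.bitsPerLayer = bitsPerLayer;
        this.hashCount = hashCount;
        this.maxEntriesPerLayer = maxEntriesPerLayer;
        this.layerSpanMs = layerSpanMs;
        this.retentionMs = retentionMs;
    }

    /** Records the pair; returns true only if no live layer already reported it. */
    public boolean add(String principal, long pid, long nowMs) {
        expire(nowMs);
        int[] idx = indexes(principal, pid);
        if (contains(idx)) {
            return false;
        }
        Layer newest = layers.peekLast();
        // Assumption: start a fresh layer when the newest one is full or too old.
        if (newest == null || nowMs - newest.createdAtMs >= layerSpanMs
                || newest.entries >= maxEntriesPerLayer) {
            newest = new Layer(bitsPerLayer, nowMs);
            layers.addLast(newest);
        }
        for (int i : idx) {
            newest.bits.set(i);
        }
        newest.entries++;
        return true;
    }

    public int layerCount() {
        return layers.size();
    }

    // Drops layers whose age has reached the retention window.
    private void expire(long nowMs) {
        while (!layers.isEmpty() && nowMs - layers.peekFirst().createdAtMs >= retentionMs) {
            layers.removeFirst();
        }
    }

    private boolean contains(int[] idx) {
        for (Layer layer : layers) {
            boolean allSet = true;
            for (int i : idx) {
                if (!layer.bits.get(i)) {
                    allSet = false;
                    break;
                }
            }
            if (allSet) {
                return true;
            }
        }
        return false;
    }

    // Double hashing over a 64-bit mix of the pair; illustrative, not a tuned hash.
    private int[] indexes(String principal, long pid) {
        long z = principal.hashCode() * 31L + pid;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        z ^= (z >>> 31);
        int h1 = (int) z;
        int h2 = (int) (z >>> 32) | 1;
        int[] idx = new int[hashCount];
        for (int i = 0; i < hashCount; i++) {
            idx[i] = Math.floorMod(h1 + i * h2, bitsPerLayer);
        }
        return idx;
    }
}
```

In this sketch a pair that is already present is not refreshed, so a long-running PID is reported as new again once the layer that first recorded it expires. Whether and how to refresh such PIDs is a separate design question that the sketch does not settle.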

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-21 Thread Claude Warren
I have an implementation of a layered Bloom filter in [1] (note the layered branch). This should handle the layering Bloom filter and allow for layers that 1. Do not become over populated and thus yield too many false positives. 2. Expire and are removed automatically. The layered Bloom

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-18 Thread Omnia Ibrahim
Hi Haruki, Thanks for having a look at the KIP. > 1. Do you have any memory-footprint estimation for TimeControlledBloomFilter? I don't have an estimate at the moment, as I don't have a full implementation of this one yet. I can work on one if it's required. > * If I read the KIP

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-08 Thread Claude Warren
The link I thought I included did not carry over in the last post. The paper can be found at: https://webdocs.cs.ualberta.ca/~drafiei/papers/DupDet06Sigmod.pdf On Thu, Jun 8, 2023 at 9:05 AM Claude Warren wrote: > > Have you considered using Stable Bloom Filters [1]. I think they do what >

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-08 Thread Claude Warren
Have you considered using Stable Bloom Filters [1]? I think they do what you want without a lot of the overhead you propose for your solution. In addition, you may want to look at Commons-Collections v4.5 [2] (currently snapshot) for efficient Bloom filter code. I have a Stable Bloom

Re: [DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-06 Thread Haruki Okada
Hi, Omnia. Thanks for the KIP. The feature sounds indeed helpful and the strategy to use bloom-filter looks good. I have three questions: 1. Do you have any memory-footprint estimation for TimeControlledBloomFilter? * If I read the KIP correctly, TimeControlledBloomFilter will be allocated

[DISCUSS] KIP-936: Throttle number of active PIDs

2023-06-06 Thread Omnia Ibrahim
Hi everyone, I want to start the discussion of KIP-936 to throttle the number of active PIDs per KafkaPrincipal. The proposal is here https://cwiki.apache.org/confluence/display/KAFKA/KIP-936%3A+Throttle+number+of+active+PIDs