I’m sorry. I misread your message. I thought you were asking about increasing the number of partitions on a topic after there were keyed events in it.
> On Nov 22, 2021, at 3:07 AM, Pushkar Deole <[email protected]> wrote:
>
> Dave,
>
> i am not sure i get your point... it is not about lesser partitions, the
> issue is about the duplicate hash caused by default partitioner for 2
> different string, which might be landing the 2 different keys into same
> partition
>
>> On Sun, Nov 21, 2021 at 9:33 PM Dave Klein <[email protected]> wrote:
>>
>> Another possibility, if you can pause processing, is to create a new topic
>> with the higher number of partitions, then consume from the beginning of
>> the old topic and produce to the new one. Then continue processing as
>> normal and all events will be in the correct partitions.
>>
>> Regards,
>> Dave
>>
>>>> On Nov 21, 2021, at 7:38 AM, Pushkar Deole <[email protected]> wrote:
>>>
>>> Thanks Luke, I am sure this problem would have been faced by many others
>>> before so would like to know if there are any existing custom algorithms
>>> that can be reused,
>>>
>>> Note that we also have requirement to maintain key level ordering, so the
>>> custom partitioner should support that as well
>>>
>>>> On Sun, Nov 21, 2021, 18:29 Luke Chen <[email protected]> wrote:
>>>>
>>>> Hello Pushkar,
>>>> Default distribution algorithm is by "hash(key) % partition_count", so
>>>> there's possibility to have the uneven distribution you saw.
>>>>
>>>> Yes, there's a way to solve your problem: custom partitioner:
>>>> https://kafka.apache.org/documentation/#producerconfigs_partitioner.class
>>>>
>>>> You can check the partitioner javadoc here
>>>> <https://kafka.apache.org/30/javadoc/org/apache/kafka/clients/producer/Partitioner.html>
>>>> for reference. You can see some examples from built-in partitioners, ex:
>>>>
>>>> clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java.
>>>> Basically, you want to focus on the "partition" method, to define your own
>>>> algorithm to distribute the keys based on the events, ex: key-1 ->
>>>> partition-1, key-2 -> partition-2... etc.
>>>>
>>>> Thank you.
>>>> Luke
>>>>
>>>>
>>>> On Sat, Nov 20, 2021 at 2:55 PM Pushkar Deole <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> We are experiencing some uneven distribution of events across topic
>>>>> partitions for a small set of unique keys: following are the details:
>>>>>
>>>>> 1. topic with 6 partitions
>>>>> 2. 8 unique keys used to produce events onto the topic
>>>>>
>>>>> Used 'key' based partitioning while producing events onto the above topic
>>>>> Observation: only 3 partitions were utilized for all the events pertaining
>>>>> to those 8 unique keys.
>>>>>
>>>>> Any idea how can the load be even across partitions while using key based
>>>>> partitioning strategy? Any help would be greatly appreciated.
>>>>>
>>>>> Note: we cannot use round robin since key level ordering matters for us
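
For reference, the explicit key-to-partition mapping Luke describes in the thread above (key-1 -> partition-1, key-2 -> partition-2, ...) could look roughly like the sketch below. This is only an illustration under assumptions: the class name FixedKeyAssignment and the method partitionFor are hypothetical, the set of keys is assumed to be small and known in advance, and the fallback for unknown keys uses String.hashCode() as a stand-in rather than the hash the default partitioner uses. To keep it self-contained it does not implement Kafka's Partitioner interface; a real custom partitioner would call logic like this from its "partition" method.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka: spreads a known, fixed list of keys
// over the partitions round-robin, so no partition stays empty while others
// take several keys because of hash collisions.
final class FixedKeyAssignment {
    private final Map<String, Integer> assigned = new HashMap<>();
    private final int numPartitions;

    FixedKeyAssignment(List<String> knownKeys, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
        for (int i = 0; i < knownKeys.size(); i++) {
            // The key at index i goes to partition i % numPartitions.
            // putIfAbsent keeps the first assignment if a key is listed twice.
            assigned.putIfAbsent(knownKeys.get(i), i % numPartitions);
        }
    }

    // Returns the same partition for the same key on every call, which is
    // what keeps all events for one key on one partition.
    int partitionFor(String key) {
        Integer partition = assigned.get(key);
        if (partition != null) {
            return partition;
        }
        // Assumption: unknown keys fall back to a plain hash modulo the
        // partition count. String.hashCode() is only a stand-in here.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

With 8 known keys and 6 partitions, the first six keys each get their own partition and the last two share partitions 0 and 1, so all six partitions receive events. Each key always maps to the same partition, so key-level ordering is kept as long as the key list and the partition count do not change.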
