Hi Stephen,

We've deprecated the partition-grouper API because of its drawbacks for upgrade compatibility (consider what happens if you want to change num.partitions while evolving your application). Instead, we're working on KIP-221, which is intended to serve the same purpose as your use case:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+DSL+with+Connecting+Topic+Creation+and+Repartition+Hint

Guozhang

On Wed, Mar 18, 2020 at 7:48 AM Stephen Young <wintersg...@googlemail.com.invalid> wrote:

> I have a question about partition assignment for a kafka streams app. As I
> understand it the more complex your topology is the greater the number of
> internal topics kafka streams will create. In my case the app has 8 graphs
> in the topology. There are 6 partitions for each graph (this matches the
> number of partitions of the input topic). So there are 48 partitions that
> the app needs to handle. These get balanced equally across all 3 servers
> where the app is running (each server also has 2 threads so there are 6
> available instances of the app).
>
> The problem for me is that the partitions of the input topic have the
> heaviest workload. But these 6 partitions are not distributed evenly
> amongst the instances. They are just considered 6 partitions amongst the 48
> the app needs to balance. But this means if a server gets most or all of
> these 6 partitions, it ends up exhausting all of the resources on that
> server.
>
> Is there a way of equally balancing these 6 specific partitions amongst the
> available instances? I thought writing a custom partition grouper might
> help here:
>
> https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#partition-grouper
>
> But the advice seems to be to not do this otherwise you risk breaking the
> app.
>
> Thanks!
>

-- 
-- Guozhang