Hi, We have an interesting problem to solve due to very large traffic volumes on particular topics. In our initial system configuration we had only one partition per topic, and in a couple of topics we have built up huge backlogs of several million messages that our consumers are slowly processing.
However, now that we have this constant backlog, we wish to repartition those topics into several partitions and run parallel consumers to handle the high message volume. If we simply repartition the topic, say from 1 to 4 partitions, the backlogged messages stay in partition 1, while partitions 2, 3, and 4 only get newly arrived messages. To work through the backlog, we need to redistribute the backlogged messages evenly among the 4 partitions.

The tools I've seen do not allow me to rewrite or "replay" the existing backlogged messages from one partition into the same or another topic with several partitions:

- kafka.tools.MirrorMaker does not allow me to move the data within the same ZooKeeper network, and
- kafka.tools.ReplayLogProducer does not write to multiple partitions. It seems that it will only write from a single partition to a single partition.

Does anyone have another way to solve this problem, or a better way of using the Kafka tools?

Thanks,
Dennis
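
P.S. To make the imbalance concrete, here is a toy model in plain Python. It is not Kafka code and uses no Kafka API: partitions are just lists (indexed from 0 here), the helper names `add_partitions`, `produce_round_robin`, and `replay_backlog` are made up for illustration, and round-robin assignment is an assumption about how new messages are spread.

```python
# Toy model only: each partition is a Python list of messages.

def add_partitions(partitions, new_count):
    """Grow the topic to new_count partitions; existing messages are not moved."""
    return partitions + [[] for _ in range(new_count - len(partitions))]

def produce_round_robin(partitions, messages):
    """Append messages to the partitions in round-robin order (assumed policy)."""
    for i, msg in enumerate(messages):
        partitions[i % len(partitions)].append(msg)

def replay_backlog(source, partition_count):
    """Copy a single-partition backlog into a fresh topic with several partitions."""
    target = [[] for _ in range(partition_count)]
    produce_round_robin(target, source)
    return target

backlog = [f"old-{i}" for i in range(1000)]

# Case 1: repartition in place. The backlog stays where it was.
topic = add_partitions([list(backlog)], 4)
produce_round_robin(topic, [f"new-{i}" for i in range(40)])
sizes_after_repartition = [len(p) for p in topic]

# Case 2: what we would like. Replay the backlog across all 4 partitions.
replayed = replay_backlog(backlog, 4)
sizes_after_replay = [len(p) for p in replayed]

print(sizes_after_repartition)  # [1010, 10, 10, 10]
print(sizes_after_replay)       # [250, 250, 250, 250]
```

Case 2 is the behaviour I am looking for. Note that a round-robin replay like this gives up the original single-partition ordering, which is acceptable for our use case as far as I can tell.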