Hi,

We have an interesting problem to solve due to very large traffic volumes
on particular topics. In our initial system configuration we had only one
partition per topic, and in a couple of topics we have built up huge
backlogs of several million messages that our consumers are slowly
processing.

However, now that we have this constant backlog, we wish to repartition
those topics into several partitions, and allow parallel consumers to run
to handle the high message volume.

If we simply repartition the topic, say from 1 to 4 partitions, the
backlogged messages stay in partition 1, while partitions 2, 3, and 4 only
get newly arrived messages. To work through the backlog in parallel, we need
to redistribute the backlogged messages evenly among the 4 partitions.
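
To make the goal concrete, here is a toy sketch in plain Python of the
redistribution I have in mind. It does not use any Kafka client; the
function name replay_round_robin and the message values are made up for
illustration, and round-robin is only one possible way to spread the
messages.

```python
from itertools import cycle


def replay_round_robin(backlog, num_partitions):
    """Spread messages from one backlogged partition across num_partitions.

    Hypothetical helper: plain lists stand in for Kafka partitions.
    """
    partitions = [[] for _ in range(num_partitions)]
    targets = cycle(range(num_partitions))
    for message in backlog:
        # Each message goes to the next partition in turn.
        partitions[next(targets)].append(message)
    return partitions


# Made-up backlog sitting in the single original partition.
backlog = [f"msg-{i}" for i in range(10)]

new_partitions = replay_round_robin(backlog, 4)
sizes = [len(p) for p in new_partitions]
print(sizes)
```

Note that a replay like this keeps the original order only within each
target partition, not across the topic as a whole.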

The tools I've seen do not allow me to rewrite or "replay" the existing
backlogged messages from one partition into the same or another topic with
several partitions:

 - using kafka.tools.MirrorMaker does not allow me to move the data within
   the same zookeeper network, and
 - using kafka.tools.ReplayLogProducer does not write to multiple
   partitions. It seems that it will write only from a single partition to
   a single partition.

Does anyone have any other way to solve this problem or a better way of
using the kafka tools?

Thanks
Dennis
