Hi, I would appreciate it if anyone could explain the reason behind the following behaviour.
I’m running a topology on a Storm cluster consisting of a nimbus and two worker nodes. The topology consists of a KafkaSpout reading messages from a Kafka topic with 8 partitions, and a KafkaBolt writing the same messages back to another Kafka topic that also has 8 partitions. The data is passed from the spout to the bolt using shuffle grouping. The parallelism hint is set to 8 for both the spout and the bolt. The semantics of the topology shouldn’t be that important, but I can explain why no additional processing takes place if required.

When running the GetOffsetShell class on the Kafka cluster to determine the number of messages per partition of the output topic, I see the following:

bytes-1496949913:2:0
bytes-1496949913:5:0
bytes-1496949913:4:0
bytes-1496949913:7:0
bytes-1496949913:1:99999992
bytes-1496949913:3:0
bytes-1496949913:6:0
bytes-1496949913:0:0

As shown above, partition 1 (the second partition) of the topic holds all of the messages, while the other seven partitions are empty, which is a rather strange imbalance. What could the reason behind this be?

Thanks in advance,
Dominik
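For anyone who wants to reproduce the check, here is a minimal sketch of how the per-partition figures above can be tallied. It assumes each GetOffsetShell entry has the form topic:partition:offset and that the reported offset equals the message count (that is, the log starts at offset 0 and nothing has been deleted by retention). The function names `parse_offsets` and `partition_shares` are made up for this illustration and are not part of Kafka or Storm.

```python
def parse_offsets(output):
    """Map partition number -> reported offset from GetOffsetShell-style text."""
    offsets = {}
    for token in output.split():
        # Split from the right so the last two fields are partition and offset.
        _topic, partition, offset = token.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    return offsets


def partition_shares(offsets):
    """Return each partition's fraction of the total, ordered by partition."""
    total = sum(offsets.values())
    return {p: (o / total if total else 0.0) for p, o in sorted(offsets.items())}


# The output quoted in the post (assumed format: topic:partition:offset).
sample = """
bytes-1496949913:2:0 bytes-1496949913:5:0 bytes-1496949913:4:0
bytes-1496949913:7:0 bytes-1496949913:1:99999992 bytes-1496949913:3:0
bytes-1496949913:6:0 bytes-1496949913:0:0
"""

offsets = parse_offsets(sample)
shares = partition_shares(offsets)
for partition, share in shares.items():
    print(f"partition {partition}: {offsets[partition]} messages ({share:.0%})")
```

With the figures from the post, this confirms that partition 1 accounts for the whole total and the remaining seven partitions contribute nothing.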
