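I think this is most likely due to https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified?. When no partitioning key is specified, the producer picks a random partition and sticks to it until the next topic metadata refresh, so a short test run only reaches a few partitions. Try reducing topic.metadata.refresh.interval.ms to something low like 2000.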
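
A minimal sketch of that setting as producer properties for the 0.8 Scala producer. The broker list and acks value are copied from the command in the quoted message below. How these properties reach your producer depends on how it is launched and is not shown here, and 2000 is only a starting point to experiment with. For comparison, the default for this setting is 600000 ms (10 minutes).

```properties
# Brokers used to bootstrap topic metadata (hosts taken from the quoted command)
metadata.broker.list=kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092

# Refresh topic metadata every 2 seconds instead of the 10 minute default,
# so a producer sending without a key switches partitions more often
topic.metadata.refresh.interval.ms=2000

# Matches --request-required-acks 1 from the quoted command
request.required.acks=1
```

After restarting the producer with a lower interval, rerunning the same GetOffsetShell command should show whether the offsets grow more evenly across the 10 partitions.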

Thanks,
Neha

On Fri, Jun 20, 2014 at 2:15 PM, Luke Forehand <[email protected]> wrote:

> We are upgrading to kafka 0.8.1.1 from 0.8-beta
>
> My first task was to start a stream of messages into a topic, using a 4
> node cluster. The topic has 10 partitions and 3 replicas.
>
> I ran the following to produce messages to the topic:
>
> socat - TCP-LISTEN:10000 | ./kafka-console-producer.sh --batch-size 10000
> --broker-list kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092
> --compress --request-required-acks 1 --topic luke1 &
>
> And then ran a loop to produce messages via telnet:
>
> while true; do echo "blah blah blah"; done | telnet 127.0.0.1 10000
>
> I verified via the console consumer that messages were being received.
> What was strange is that after some time I quit the program and checked
> the partition offsets:
>
> ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
> kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic luke1 --time
> -1
>
> luke1:0:1853139
> luke1:1:9
> luke1:2:1
> luke1:3:3
> luke1:4:50
> luke1:5:266603
> luke1:6:80035
> luke1:7:3455509
> luke1:8:3756164
> luke1:9:5
>
> They are very uneven, any ideas what is going on here?
>
> Thanks,
> Luke Forehand | Networked Insights | Software Engineer
