I suspect the threads are not each assigned an equal number of messages to send. I don't think it matters whether you use one producer or several, as long as you distribute the work equally among those threads.
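
As a rough sanity check of that reasoning, here is a minimal Java sketch. It does not use the Kafka producer at all: java.util.Random stands in for the random partitioner, and the class and method names (RandomPartitionSketch, simulate) are made up for illustration. It simulates 16 threads that each send the same number of messages to one of 4 partitions chosen uniformly at random. Under those assumptions the per-partition counts should come out nearly equal, so a 10x skew would suggest uneven work per thread or a partition choice that is not actually uniform.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical stand-in for the setup in the thread; not Kafka code.
class RandomPartitionSketch {

    // Runs `threads` sender threads. Each has its own Random (standing in for
    // a per-thread producer) and sends `messagesPerThread` messages, picking a
    // partition uniformly at random. Returns the message count per partition.
    static long[] simulate(int threads, int partitions, int messagesPerThread) {
        AtomicLongArray counts = new AtomicLongArray(partitions);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long seed = 42L + t; // fixed seeds keep the run repeatable
            workers[t] = new Thread(() -> {
                Random random = new Random(seed);
                for (int i = 0; i < messagesPerThread; i++) {
                    counts.incrementAndGet(random.nextInt(partitions));
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for senders", e);
        }
        long[] result = new long[partitions];
        for (int p = 0; p < partitions; p++) {
            result[p] = counts.get(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // 16 threads, 4 partitions, equal work per thread.
        System.out.println(Arrays.toString(simulate(16, 4, 10_000)));
    }
}
```

If you change the sketch so that one thread sends far more messages than the others, the totals still stay balanced across partitions as long as the choice is uniform, which is why it is worth checking both the per-thread load and how the partition is actually picked.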
Thanks,
Neha

On Wednesday, April 17, 2013, Helin Xiang wrote:

> Hi,
> We are using kafka 0.7.2.
>
> The situation is a little complicated:
>
> 1. We use the Java API and multiple threads (around 16) to send logs to
> kafka. Each thread contains its own kafka.javaapi.producer.Producer
> object.
> 2. There is one topic whose partition count is set to 4. We use random
> partitioning to send.
> 3. We generate messages for this topic at a rate of 100 per second, so each
> thread only gets several logs per second.
>
> But we find that the 4 partitions get unbalanced data. Partition 0 gets 10
> times more logs than partitions 1, 2 and 3. Partitions 1, 2 and 3 get nearly
> equal numbers of messages.
>
> After that, we set the number of threads to 1, and the imbalance vanished.
>
> We are not sure what happens inside the Java API of Producer.
> Could anyone explain it?
> Or is it necessary to create a new kafka.javaapi.producer.Producer object
> in each thread? I hear the kafka.javaapi.producer.Producer class is thread
> safe, but I don't know if 1 producer object can handle large throughput.
>
> THANKS
>
> --
> *Best Regards
>
> Xiang Helin*