Bhavesh, have you taken a look at https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified ?
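
To make the idea in that FAQ entry concrete, here is a rough simulation. It is not Kafka client code, and every name in it is made up for illustration. It assumes a hypothetical topic with 8 partitions and a handful of producers with very different message rates. It compares two strategies: each producer sticking to one random partition for a whole interval (what non-keyed messages get out of the box), and picking a random partition per message (roughly what a key plus a random partitioner would give you).

```python
import random
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count for the topic


def random_partition(key, num_partitions, rng):
    """Ignore the key's value and pick a partition uniformly at random.

    The key only needs to be non-null so that the partitioner is consulted
    for every message instead of the sticky default path.
    """
    return rng.randrange(num_partitions)


def simulate(message_counts, num_partitions, seed=0):
    """Per-message random choice: one partition pick for every message."""
    rng = random.Random(seed)
    per_partition = Counter()
    for producer_id, count in enumerate(message_counts):
        for seq in range(count):
            key = f"{producer_id}-{seq}"  # any non-null key will do
            per_partition[random_partition(key, num_partitions, rng)] += 1
    return per_partition


def simulate_sticky(message_counts, num_partitions, seed=0):
    """Sticky choice: each producer sends a whole interval to one partition."""
    rng = random.Random(seed)
    per_partition = Counter()
    for count in message_counts:
        per_partition[rng.randrange(num_partitions)] += count
    return per_partition
```

With producers sending, say, 50000, 5000, 500 and 20000 messages in an interval, simulate() should land close to the same number of messages on every partition regardless of which producer is the heavy one, while simulate_sticky() can touch at most four of the eight partitions and puts all 50000 messages from the busiest producer on a single one.
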
Maybe the root cause is something else? Even if some producers produce more or fewer messages than others, you should be able to make the distribution random enough with a partitioner and a key. I don't think you should need more than what is in the FAQ, but in case you do, maybe look into http://en.wikipedia.org/wiki/MurmurHash as another hash option.

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Mon, Aug 4, 2014 at 9:12 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:

> How to achieve uniform distribution of non-keyed messages per topic across
> all partitions?
>
> We have tried to achieve this uniform distribution across partitions using
> custom partitioning from each producer instance with round robin
> (count(messages) % number of partitions for topic). This strategy results in
> very poor performance, so we have switched back to the random stickiness that
> Kafka provides out of the box per some interval (10 minutes, not sure
> exactly) per topic.
>
> The above strategy sometimes results in consumer-side lag for some
> partitions, because we have some applications/producers producing more
> messages for the same topic than other servers.
>
> Can Kafka provide out-of-the-box uniform distribution by using coordination
> among all producers, relying on a measured rate such as # of messages per
> minute or # of bytes produced per minute to achieve uniform distribution and
> coordinate stickiness of partitions among hundreds of producers for the same
> topic?
>
> Thanks,
>
> Bhavesh