I have the following problem. I have tried almost everything I could think of, without any luck.
All I want is 1 producer, 1 topic, 10 partitions and 10 consumers, and to send 1M messages via the producer to these 10 consumers. I am using Kafka 0.8.3 built from current upstream, so I have the bleeding-edge code. It does not work on 0.8.1.1 or on the 0.8.2 stream either.

I expect that when I send 1 million messages via that producer, all consumers will be busy. In other words, if each message is sent to a randomly chosen partition (roundrobin / range), all 10 consumers should process about 100k messages each, because the producer picks one of these 10 partitions at random. But I have never achieved such an outcome.

I tried these combinations:

1) Old Scala producer vs. old Scala consumer. The consumer was created by Consumers.createJavaConsumer() ten times. Every consumer runs in a separate thread.

2) Old Scala producer vs. new Java consumer. The new consumer was used in two ways: 10 consumers listening on the topic, and 10 consumers each subscribed to 1 partition (consumer 1 to partition 1, consumer 2 to partition 2, and so on).

3) Old Scala producer with a custom partitioner. I even tried my own partitioner: I just generated a random number from 0 to 9, so I expected each message to be sent to the partition with that number.

All I see is that only a couple of these 10 consumers are utilized. Even though I am sending 1M messages, the debugging output shows only some preselected set of consumers, which appears to be chosen randomly. Do you have ANY hint why not all consumers are utilized even though partitions are selected randomly?

My initial suspicion was that rebalancing was done badly. The thing was that I was creating the old consumers in a loop, quickly one after another, and I can imagine that the rebalancing algorithm got confused.
So I abandoned this solution and decided to subscribe the consumers one by one, each to its own partition, so that I would have 1 consumer subscribed to exactly 1 partition and there would be no rebalancing at all. Oh my, how wrong I was ... nothing changed.

So I was thinking that if I have 10 consumers, each subscribed to 1 partition, maybe the producer is just sending messages to some subset of the partitions and that's it. I was not sure how this could be possible, so to be completely sure that messages are spread evenly across partitions, I used a custom partitioner class in the old producer, so that the partition each message is sent to is truly random. But that does not seem to work either.

Please, people, help me.

-- Stefan Miklosovic
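
P.S. To make the expectation from combination 3 concrete, here is a minimal standalone sketch of it. It is only a simulation under stated assumptions: it uses no Kafka classes, it is not the actual partitioner code from my test, and the names RandomPartitionSketch, choosePartition and countPerPartition are hypothetical, made up for illustration.

```java
import java.util.Random;

// Standalone simulation only: no Kafka classes are used here.
class RandomPartitionSketch {

    // Hypothetical stand-in for a custom partitioner: picks a partition
    // uniformly at random from [0, numPartitions).
    static int choosePartition(Random rng, int numPartitions) {
        return rng.nextInt(numPartitions);
    }

    // Counts how many of `messages` messages land on each partition.
    static int[] countPerPartition(int messages, int numPartitions, long seed) {
        Random rng = new Random(seed);
        int[] counts = new int[numPartitions];
        for (int i = 0; i < messages; i++) {
            counts[choosePartition(rng, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 1M messages over 10 partitions, as in the scenario above.
        int[] counts = countPerPartition(1_000_000, 10, 42L);
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

With a uniform choice like this, every partition should receive close to 100k of the 1M messages. That is the distribution I expected to see on the consumer side, and it is not what I observe.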