I have the following problem. I have tried almost everything I could,
but without any luck.

All I want to do is to have 1 producer, 1 topic, 10 partitions and 10 consumers.

All I want is to send 1M messages via the producer to these 10 consumers.

I am using Kafka 0.8.3 built from current upstream, so I have the
bleeding-edge code. It does not work on 0.8.1.1 or the 0.8.2 stream either.

The problem is that when I send 1 million messages via that
producer, I expect all consumers to be busy. In other words, if each
message is sent to a randomly chosen partition (roundrobin / range),
I expect each of the 10 consumers to process about 100k messages,
because the producer spreads them across these 10 partitions.
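
To make that expectation concrete, here is a minimal sketch in plain
Java. It does not use Kafka at all; the class name, method name and
seed are made up for illustration. It only shows what a uniformly
random choice among 10 partitions should give for 1M messages.

```java
import java.util.Random;

// Hypothetical helper, not part of Kafka: simulates uniformly random partition selection.
final class PartitionSpreadSketch {

    // Picks a random partition for every message and returns the per-partition counts.
    static int[] simulate(int messages, int partitions, long seed) {
        Random random = new Random(seed);
        int[] counts = new int[partitions];
        for (int i = 0; i < messages; i++) {
            counts[random.nextInt(partitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = simulate(1_000_000, 10, 42L);
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

With a uniform choice, every partition should end up close to 100k
messages, which is the distribution I expected to see on the consumers.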

But I have never achieved such an outcome.

I tried these combinations:

1) old scala producer vs old scala consumer

The consumer was created by Consumer.createJavaConsumerConnector() ten
times. Every consumer runs in a separate thread.

2) old scala producer vs new java consumer

The new consumer was used both ways: with 10 consumers listening on the
topic, and with 10 consumers each subscribed to 1 partition (consumer 1 -
partition 1, consumer 2 - partition 2 and so on).

3) old scala producer with custom partitioner

I even tried to use my own partitioner. It just generated a random
number from 0 to 9, so I expected each message to be sent to the
partition with that number.
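
As an illustration of what 3) describes, a random partitioner could
look roughly like the sketch below. The class name is made up and the
class is kept standalone so it compiles without Kafka on the classpath.
As far as I understand, a real one for the old producer would implement
kafka.producer.Partitioner and be registered through the
partitioner.class producer property; treat that wiring as an assumption
here.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical standalone class; it only mirrors the shape partition(key, numPartitions).
final class RandomPartitionerSketch {

    // Ignores the key and returns a uniformly random index in [0, numPartitions).
    public int partition(Object key, int numPartitions) {
        return ThreadLocalRandom.current().nextInt(numPartitions);
    }
}
```

Called with numPartitions = 10, this returns values from 0 to 9, which
is all the partitioner was meant to do.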

All I see is that only a couple of these 10 consumers are utilized.
Even though I am sending 1M messages, the debugging output shows only
a fixed subset of consumers doing any work, and that subset appears
to be selected randomly.
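
A sketch of the kind of per-consumer counting that makes this visible is
below. It is plain Java with in-memory queues standing in for
partitions, so it says nothing about Kafka itself; all names are
hypothetical. Each worker thread plays one consumer and counts what it
takes from its own queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical stand-in: one queue per "partition", one thread per "consumer".
final class ConsumerUtilizationSketch {
    private static final int STOP = -1; // marker telling a worker to finish

    // Sends `messages` items to random queues and returns how many each worker processed.
    static long[] run(int consumers, int messages) {
        AtomicLongArray processed = new AtomicLongArray(consumers);
        List<BlockingQueue<Integer>> partitions = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1024);
            partitions.add(queue);
            final int id = c;
            Thread worker = new Thread(() -> {
                try {
                    while (queue.take() != STOP) {
                        processed.incrementAndGet(id); // count one handled message
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }
        try {
            for (int m = 0; m < messages; m++) {
                int target = ThreadLocalRandom.current().nextInt(consumers);
                partitions.get(target).put(m); // message ids are >= 0, never STOP
            }
            for (BlockingQueue<Integer> queue : partitions) {
                queue.put(STOP);
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the sketch", e);
        }
        long[] result = new long[consumers];
        for (int c = 0; c < consumers; c++) {
            result[c] = processed.get(c);
        }
        return result;
    }
}
```

In this stand-in every worker ends up with a non-zero count, which is
the picture I expected from the real consumers but never got.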

Do you have ANY hint why not all consumers are utilized even though
partitions are selected randomly?

My initial suspicion was that rebalancing was done badly. The thing
was that I was creating the old consumers in a loop, quickly one after
another, and I can imagine that the rebalancing algorithm got confused.

So I abandoned this solution and thought: let's just subscribe these
consumers one by one to a partition each, so I will have 1 consumer
subscribed to just 1 partition and there will not be any rebalancing
at all.

Oh my, how wrong I was ... nothing changed.

So I thought that if I have 10 consumers, each one subscribed to
1 partition, maybe the producer is just sending messages to some
subset of the partitions and that's it. I was not sure how this could
be possible, so to be completely sure that messages are spread evenly
across partitions, I used a custom partitioner class in the old
producer, so that the partition each message is sent to is really
random.

But that does not seem to work either.

Please people, help me.

-- 
Stefan Miklosovic
