Hi All,

Using Kafka's high-level consumer API, I have run into a situation where
launching a consumer process P with X consuming threads on a topic with X
partitions kicks out all other consumer threads that were consuming before
P was launched.
That is, process P steals all partitions from the other consumer
processes.
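
For reference, my setup is roughly like the sketch below (a simplified
example of the 0.8-style high-level consumer; the ZooKeeper address, group
id, topic name, and thread count are placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    public class ConsumerProcess {
        public static void main(String[] args) {
            int numThreads = 4; // X: matched to the partition count of the topic

            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // placeholder
            props.put("group.id", "my-consumer-group");       // placeholder

            ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            // Ask for X streams on the topic; each stream is consumed by one thread.
            // When this process joins the group, a rebalance redistributes the
            // topic's partitions across all consumers in the group.
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(
                            Collections.singletonMap("my-topic", numThreads));

            for (final KafkaStream<byte[], byte[]> stream : streams.get("my-topic")) {
                new Thread(() -> {
                    ConsumerIterator<byte[], byte[]> it = stream.iterator();
                    while (it.hasNext()) {
                        MessageAndMetadata<byte[], byte[]> msg = it.next();
                        System.out.println(
                                "partition " + msg.partition() + " offset " + msg.offset());
                    }
                }).start();
            }
        }
    }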

While this is understandable, it makes it hard to size and deploy a cluster
with a partition count that, on the one hand, balances consumption across
consumer processes by configuring each consumer with its share of the
topic's total partitions, and on the other hand leaves room for growth and
for adding new consumers as traffic into the cluster and the topic
increases.

This stealing effect forces me either to create more partitions than I
currently need, planning for future growth, or to stick with what I need
now and rely on adding partitions later, which comes at a price: restarting
consumers, out-of-order messages under hash partitioning, and so on.
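
To illustrate the out-of-order concern: with hash partitioning the target
partition is derived from the key's hash modulo the partition count, so
growing the partition count re-maps existing keys. A toy sketch (the modulo
scheme is only in the spirit of the default partitioner; the key and the
partition counts are arbitrary):

    public class PartitionRemapDemo {
        // Simplified key-to-partition mapping: hash(key) mod numPartitions.
        static int partitionFor(String key, int numPartitions) {
            return Math.abs(key.hashCode()) % numPartitions;
        }

        public static void main(String[] args) {
            String key = "user-42"; // arbitrary example key

            int before = partitionFor(key, 8);  // partition count before expansion
            int after  = partitionFor(key, 12); // partition count after adding partitions

            // If 'before' != 'after', new messages for this key land on a different
            // partition than its older messages, so per-key ordering is no longer
            // guaranteed across the expansion.
            System.out.println("before=" + before + " after=" + after);
        }
    }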

Is this stealing policy intended, or did I jump to conclusions?
What is the recommended way to approach the sizing question?

Shlomi
