Hello guys,

   I have a question about using the createMessageStreamsByFilter API. The
API takes a regex and a number of streams.
   In my use case, I have a regex that acts as a filter to match topic names.
Each matching topic has 6 partitions. I want to achieve maximum parallelism
by having each thread read from one and only one partition of one topic. But
from what I read here:
http://grokbase.com/t/kafka/users/152agpgppv/createmessagestreams-vs-createmessagestreamsbyfilter
it looks like the createMessageStreamsByFilter API doesn't support this level
of parallelism, because each stream returned by the API reads from all
matching topics.
   Here is an example from the thread at the above URL:
say you have topic AC with 3 partitions and topic BC with 6 partitions.
With createMessageStreamsByFilter("*C" => 3), a total of 3 threads will
be created,
consuming AC-1/BC-1/BC-2, AC-2/BC-3/BC-4, and AC-3/BC-5/BC-6 respectively.
So each stream/thread reads from both topics. What I want is for each
thread to read from only one partition.
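
   To make the example concrete, here is a small self-contained Java sketch
that reproduces the assignment described in that thread. It is only a
simulation, not Kafka code: the class StreamAssignmentSketch and the method
assign are hypothetical names I made up, and I am assuming that each topic's
partitions are split into contiguous ranges across the streams, which is what
the quoted example suggests. Partition numbers start at 1 only to match the
quoted example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StreamAssignmentSketch {

    // Hypothetical helper: splits each topic's partitions into contiguous
    // ranges across numStreams streams, so every stream gets a share of
    // every matching topic. This mirrors the quoted example only; it is an
    // assumption about the behavior, not Kafka's actual implementation.
    static List<List<String>> assign(Map<String, Integer> topicPartitions, int numStreams) {
        List<List<String>> streams = new ArrayList<>();
        for (int i = 0; i < numStreams; i++) {
            streams.add(new ArrayList<>());
        }
        for (Map.Entry<String, Integer> entry : topicPartitions.entrySet()) {
            int partitions = entry.getValue();
            int perStream = partitions / numStreams; // base share per stream
            int extra = partitions % numStreams;     // first 'extra' streams get one more
            int next = 1;                            // 1-based, as in the quoted example
            for (int s = 0; s < numStreams; s++) {
                int count = perStream + (s < extra ? 1 : 0);
                for (int k = 0; k < count; k++) {
                    streams.get(s).add(entry.getKey() + "-" + next);
                    next++;
                }
            }
        }
        return streams;
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new LinkedHashMap<>();
        topics.put("AC", 3);
        topics.put("BC", 6);
        List<List<String>> streams = assign(topics, 3);
        for (int s = 0; s < streams.size(); s++) {
            System.out.println("stream " + s + ": " + streams.get(s));
        }
    }
}
```

   Under these assumptions every stream ends up with partitions from both AC
and BC, which is exactly the behavior I am trying to avoid.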

   Does anyone know how I can achieve this?

Thank you,
Chaoran Yu
