take a look at kafka client https://github.com/gerritjvv/kafka-fast, it
uses a different approach where you can have more than several consumers
per topic+partition (i.e no relation between topic partitions and
consumers). It uses redis but only for offsets and work distribution, not
for the messages itself.

On Fri, Sep 30, 2016 at 4:07 PM, craig w <codecr...@gmail.com> wrote:

> I have a scenario where, for a given topic I'll have 500 consumers (1
> consumer per instance of an app). I've setup the topic so it has 500
> partitions, thus ensuring each consumer will eventually get work (the data
> produced into kafka  use the default partitioning strategy).
>
> note: These consumer app instances are run in containers via Marathon
> (using Mesos).
>
> Several times a day the consumer apps can be intentionally restarted (to
> upgrade the app, etc). When a rolling restart occurs, Kafka begins its
> rebalancing process. This process can take 10 minutes or so as the rolling
> restart itself takes a few minutes. As a result, what I've seen is that a
> consumer will have its partitions reassigned, consume a new message, start
> working on it, and then a reassignment occurs again. The work being
> performed when a message is received is effectively lost since messages
> being processed take 30s - 2 hours to process, and a re-assignment occurs.
>
> One suggestion from someone was to create a separate "app" in marathon for
> each instance, therefore I'd have 500 apps in marathon, and assign each one
> a specific partition number instead of letting Kafka assign partitions
> automatically to the consumers. This is problematic because I need to be
> able to increase/decrease the number of instances of the app based on
> demand coming into the system.
>
> To work around this, we have a custom component that consumes kafka topics
> and puts messages into redis lists (one per kafka topic). Then our
> consumers are doing a BLPOP (blocking pop operation) to ensure the message
> is only processed once, but also helps avoid rebalancing in kafka when the
> consumer apps are restarted.
>
> I'm considering using a different queueing system such as ActiveMQ,
> RabbitMQ...to avoid this kafka to redis scenario. Is kafka the right fit?
> Is there a better approach to doing this with kafka?
>
> Thanks in advance,
> Craig
>

Reply via email to