I’m using Kafka for what I think is a somewhat non-standard purpose.  We have 
multiple producers which send messages to a topic.  We have, say, as many as 
500 or 1,000 consumers, each of which wants to read every message posted to 
the topic, but *only* from the point in time that the consumer came alive and 
started listening.  (We’re using Kafka to replace a UDB-based message system.)

Given the above, I set things up as:

        + topic has only 1 partition
        + retention time is very short (10 seconds)
        + every consumer uses the same group.id
        + consumers DO NOT commit offsets ever
        + new consumers set the offset to the “latest available” when they start
        + consumers don’t subscribe: they assign themselves to partition 0 of 
the topic
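
Concretely, each consumer is set up roughly like the sketch below (plain Java 
client; the broker address, topic name, and group name are placeholders 
rather than our real ones):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class TailConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");   // placeholder
            props.put("group.id", "tail-consumers");         // same group.id for every consumer
            props.put("enable.auto.commit", "false");        // offsets are never committed
            props.put("auto.offset.reset", "latest");        // no committed offset => start at the end
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            TopicPartition tp = new TopicPartition("events", 0);   // placeholder topic name
            consumer.assign(Collections.singletonList(tp));        // assign, not subscribe
            consumer.seekToEnd(Collections.singletonList(tp));     // make the "latest only" intent explicit

            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value());            // hand the message to the application
                }
            }
        }
    }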

This works fine: each consumer appears to get a copy of all the new messages 
that come in after it starts consuming.

Question: Is using the same group.id for every consumer likely to wreck 
anything?  I could generate a random group.id so that every consumer appears 
to be distinct, but given that each consumer assigns itself to partition 0 
and never commits offsets, is what I’m doing sound?
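
(If distinct groups turn out to be the safer choice, I assume the only change 
in the sketch above would be generating a unique group.id per consumer, e.g.:

    props.put("group.id", "tail-consumer-" + java.util.UUID.randomUUID());

)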

I’m slightly worried I’m misusing Kafka somehow: is this likely to keep 
working in future versions?
As I said, reusing the same group.id each time is the part that worries me, 
but it also seems bad to generate something like 1,000 different 
consumer-group records on the broker if there is no need…

Also, I want to make sure I don’t bog the brokers down with needless log 
segments for messages that nobody will see.  (Once a message gets to be 10 
seconds old, nobody should see it, because if you weren’t listening when it 
was produced, you don’t care about it.)
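
For reference, this is roughly how the topic gets created, via the Java 
AdminClient (again, the broker address and topic name are placeholders, and 
I haven’t verified how quickly a 10-second retention.ms actually deletes old 
segments):

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTailTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");       // placeholder

            try (AdminClient admin = AdminClient.create(props)) {
                NewTopic topic = new NewTopic("events", 1, (short) 1)   // 1 partition, replication factor 1
                        .configs(Map.of("retention.ms", "10000"));      // 10-second retention
                admin.createTopics(Collections.singletonList(topic)).all().get();
            }
        }
    }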

Thanks for any insights/opinions.

        David Baraff
        d...@pixar.com

