Hi,

How are people specifying/persisting/resetting the consumer group identifier 
("group.id") when using the high-level consumer?

I understand how it works. I specify some string and all consumers that use 
that same string will help consume a topic. The partitions will be distributed 
amongst them for consumption. And when they save their offsets, the offsets 
will be saved according to the consumer group. That all makes sense to me.

What I don't understand is the best way to set and persist it, and reset it 
if needed. For example, do I simply hardcode the string in my code? If so, then 
all deployed instances will have the same value (that's good). If I want to 
bring up a test instance of that code, or a new installation, though, then it 
will also share the load (that's bad). 

If I pass in a value to my instances, that lets me have different test and 
production instances of the same code (that's good), but then I have to persist 
my consumer group id somewhere outside of the process (on disk, in zookeeper, 
etc). Which then means I need some way to manage *that* identifier (that's... 
just how it is?).
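
To make the "pass in a value" option concrete, here is a minimal sketch of one way it could look. The environment variable names (APP_ENV, KAFKA_GROUP_ID), the "myapp" prefix, and the build_consumer_config helper are all made up for illustration; only the "group.id" key is an actual consumer setting.

```python
import os


def build_consumer_config(env=None):
    """Return consumer settings with group.id resolved from the environment.

    APP_ENV and KAFKA_GROUP_ID are hypothetical variable names.
    """
    env = os.environ if env is None else env
    deployment = env.get("APP_ENV", "dev")
    # An explicit override wins; otherwise derive one id per deployment so
    # that test and production instances never share partitions.
    group_id = env.get("KAFKA_GROUP_ID") or f"myapp-{deployment}"
    return {"group.id": group_id}


prod = build_consumer_config({"APP_ENV": "prod"})
test = build_consumer_config({"APP_ENV": "test"})
print(prod["group.id"], test["group.id"])
```

With this shape, the identifier lives in the deployment configuration rather than in the code, which moves the "where do I persist it" problem but does not remove it.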

What if I decide that I want my app to start over? In the case of log-compacted 
streams, I want to throw away any processing I did and start "from the 
beginning". Do I change my consumer group, which effectively resets everything? 
Or do I delete my saved offsets, and then resume with the same consumer group? 
The latter is functionally equivalent to the former.
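
As a sketch of the first option (changing the consumer group), one possibility is to put a generation number in the group id and bump it to start over. The naming scheme and both helper functions below are hypothetical. The "auto.offset.reset" value of "smallest" assumes the old high-level consumer; I believe the newer Java consumer calls the same setting "earliest".

```python
def group_id_for(app, deployment, generation):
    """Build a group id that changes whenever the generation is bumped."""
    return f"{app}-{deployment}-v{generation}"


def consumer_config(app, deployment, generation):
    return {
        "group.id": group_id_for(app, deployment, generation),
        # A brand-new group has no saved offsets, so this setting decides
        # where it starts. "smallest" means the earliest available offset.
        "auto.offset.reset": "smallest",
    }


before = consumer_config("myapp", "prod", 1)
after = consumer_config("myapp", "prod", 2)
print(before["group.id"], "->", after["group.id"])
```

The generation number itself still has to be stored somewhere outside the process, so this has the same management question as the group id.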

Thanks,
-James
