When we need to delete a group, we do it in ZooKeeper directly. When we need to roll back offsets, we use the Import/Export tool classes, because that's a little more efficient than working in ZooKeeper directly. You can find the details on the tools at https://cwiki.apache.org/confluence/display/KAFKA/System+Tools
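
As a rough illustration of the roll-back step, here is a minimal sketch of editing an exported offsets file before importing it again. It assumes the export file holds one `<zookeeper-path>:<offset>` entry per line (my understanding of what the ExportZkOffsets tool writes; check this against your Kafka version). The helper name `rewind_offsets`, the group `my-group`, and the topic `my-topic` are all made up for the example. The consumer should be stopped while the offsets are changed.

```python
def rewind_offsets(lines, rewind_by):
    """Return export-file lines with every offset lowered by rewind_by.

    Assumed line format (not confirmed by this thread):
        /consumers/<group>/offsets/<topic>/<partition>:<offset>
    Offsets are clamped at 0 so they never go negative.
    """
    rewound = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last colon; the ZooKeeper path itself has none.
        path, _, offset = line.rpartition(":")
        new_offset = max(0, int(offset) - rewind_by)
        rewound.append(f"{path}:{new_offset}")
    return rewound


# Hypothetical exported offsets for a two-partition topic.
exported = [
    "/consumers/my-group/offsets/my-topic/0:1500",
    "/consumers/my-group/offsets/my-topic/1:40",
]

rewound = rewind_offsets(exported, 100)
print("\n".join(rewound))
```

The rewritten lines would then be saved to a file and fed to the import tool before restarting the consumer. Whether a rewound offset still exists on the broker depends on retention, so treat the numbers as illustrative.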
I believe that as of right now (and please someone correct me if I'm wrong), the Import/Export tools are not available yet for Kafka-committed offsets. There is a patch for this, and we have a version of it built internally. I think it's waiting for some of the KIP work before it is finalized.

-Todd

On Wed, Jun 10, 2015 at 4:48 PM, James Cheng <jch...@tivo.com> wrote:

>
> > On Jun 10, 2015, at 1:26 PM, Todd Palino <tpal...@gmail.com> wrote:
> >
> > For us, group ID is a configuration parameter of the application. So we
> > store it in configuration files (generally on disk) and maintain it there
> > through our configuration and deployment infrastructure. As you pointed
> > out, hard coding the group ID into the application is not usually a good
> > pattern.
> >
> > If you want to reset, you have a couple choices. One is that you can just
> > switch group names and start fresh. Another is that you can shut down the
> > consumer and delete the existing consumer group, then restart. You could
> > also stop, edit the offsets to set them to something specific (if you need
> > to roll back to a specific point, for example), and restart.
> >
>
> Thanks Todd. That helps. The "on disk" storage doesn't work well if you
> are running consumers in ephemeral nodes like EC2 machines, but in that
> case, I guess you would save the group ID in some other data store ("on
> disk, but elsewhere") associated with your "application cluster" rather
> than any one node of the cluster.
>
> I often hear about people saving their offsets using the consumer, and
> monitoring offsets for lag. I don't hear much about people deleting or
> changing/setting offsets by other means. How is it usually done? Are there
> tools to change the offsets, or do people go into zookeeper to change them
> directly? Or, for broker-stored offsets, use the Kafka APIs?
>
> -James
>
> > -Todd
> >
> >
> > On Wed, Jun 10, 2015 at 1:20 PM, James Cheng <jch...@tivo.com> wrote:
> >
> >> Hi,
> >>
> >> How are people specifying/persisting/resetting the consumer group
> >> identifier ("group.id") when using the high-level consumer?
> >>
> >> I understand how it works. I specify some string and all consumers that
> >> use that same string will help consume a topic. The partitions will be
> >> distributed amongst them for consumption. And when they save their offsets,
> >> the offsets will be saved according to the consumer group. That all makes
> >> sense to me.
> >>
> >> What I don't understand is the best way to set and persist them, and reset
> >> them if needed. For example, do I simply hardcode the string in my code? If
> >> so, then all deployed instances will have the same value (that's good). If
> >> I want to bring up a test instance of that code, or a new installation,
> >> though, then it will also share the load (that's bad).
> >>
> >> If I pass in a value to my instances, that lets me have different test and
> >> production instances of the same code (that's good), but then I have to
> >> persist my consumer group id somewhere outside of the process (on disk, in
> >> zookeeper, etc). Which then means I need some way to manage *that*
> >> identifier (that's... just how it is?).
> >>
> >> What if I decide that I want my app to start over? In the case of
> >> log-compacted streams, I want to throw away any processing I did and start
> >> "from the beginning". Do I change my consumer group, which effective resets
> >> everything? Or do I delete my saved offsets, and then resume with the same
> >> consumer group? The latter is functionally equivalent to the former.
> >>
> >> Thanks,
> >> -James
> >>
> >>
> >