[
https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tejas Patil updated KAFKA-559:
------------------------------
Attachment: KAFKA-559.v2.patch
_1. Passing a groupId for cleanup will make the cleanup job tedious since we
tend to have hundreds of console-consumer group ids in ZK that are stale.
Running the tool for a particular topic or all topics probably makes more
sense._
I had received a different requirement spec, which was based on "group" rather
than "topic". In the new patch, I have added support for topic-based deletion too.
_2. I would suggest accepting a date param "mm-dd-yyyy hh:mm:ss,SSS" as a
String instead of accepting a timestamp value, and deleting the group only if
it has had no updates to its offsets since that date, as described above._
I discussed this one with Joel, and he suggested using epoch time instead of
"mm-dd-yyyy". I am open to changing it if there is consensus on the date
format.
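For what it's worth, the two options reduce to the same comparison once the date string is converted to epoch milliseconds. Below is a minimal sketch of that conversion, not code from the patch: the class and method names (`SinceParser`, `toEpochMillis`, `isStale`) are hypothetical, and the time zone is passed explicitly as an assumption. Note that in `SimpleDateFormat` the month is `MM` and the 24-hour field is `HH` (lowercase `mm` means minutes), so the pattern is written as "MM-dd-yyyy HH:mm:ss,SSS".

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of KAFKA-559.v2.patch.
final class SinceParser {
    // MM = month, dd = day, HH = hour of day (0-23), mm = minute, SSS = milliseconds.
    static final String PATTERN = "MM-dd-yyyy HH:mm:ss,SSS";

    // Converts a date string in PATTERN to epoch milliseconds in the given time zone.
    static long toEpochMillis(String since, TimeZone tz) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ROOT);
        fmt.setLenient(false); // reject impossible dates such as month 13
        fmt.setTimeZone(tz);
        try {
            return fmt.parse(since).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Expected a date like " + PATTERN + ", got: " + since, e);
        }
    }

    // An entry is stale if its last offset update happened before the "since" instant.
    static boolean isStale(long lastOffsetUpdateMillis, long sinceMillis) {
        return lastOffsetUpdateMillis < sinceMillis;
    }

    public static void main(String[] args) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        long since = toEpochMillis("01-15-2013 00:00:00,000", utc);
        System.out.println("since (epoch ms) = " + since);
        System.out.println("stale? " + isStale(since - 1, since));
    }
}
```

Accepting the epoch value directly skips the parsing step and the time-zone question, while the date string is easier to type correctly by hand; the staleness check is identical either way.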
_3. It's dangerous to delete the entire group if the date/"since" is not
provided. It's very easy for user to specify only two arguments (topic and
zkconnect) and not specify the date. Let's also make sure that the user always
specifies a date._
Ok. Change implemented
_4. "dry-run" does not need to accept any value. You can simply use
parser.accepts("dry-run", "....") and then use if (options.has(dryRunOpt))
....._
Agreed. Change implemented
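To make points 3 and 4 concrete, here is a hand-rolled sketch of the intended argument handling using only the Java standard library. It is not the joptsimple-based code in the patch: the class name `CleanupArgs` is hypothetical, and the option names are taken from the issue description (`--topic`, `--since`, `--zookeeper`) plus the `dry-run` flag discussed above. The tool refuses to run unless `--since` is given, and `--dry-run` is a bare flag that takes no value.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the option parsing used in KAFKA-559.v2.patch.
final class CleanupArgs {
    final Map<String, String> values = new HashMap<>();
    boolean dryRun = false;

    static CleanupArgs parse(String[] args) {
        CleanupArgs parsed = new CleanupArgs();
        List<String> valued = Arrays.asList("--zookeeper", "--since", "--topic");
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("--dry-run")) {
                parsed.dryRun = true; // presence alone enables dry-run mode
            } else if (valued.contains(arg)) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("Missing value for " + arg);
                }
                parsed.values.put(arg, args[++i]); // consume the option's value
            } else {
                throw new IllegalArgumentException("Unknown option: " + arg);
            }
        }
        // Never fall back to "delete everything": both options must be supplied.
        for (String required : Arrays.asList("--zookeeper", "--since")) {
            if (!parsed.values.containsKey(required)) {
                throw new IllegalArgumentException("Missing required option: " + required);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        CleanupArgs parsed = parse(new String[] {
            "--zookeeper", "localhost:2181", "--since", "1358208000000", "--dry-run"});
        System.out.println("dryRun=" + parsed.dryRun + " since=" + parsed.values.get("--since"));
    }
}
```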
_5. We can inline exitIfNoPathExists, the implementation is small and clear
enough._
While adding support for topic-based deletion, that method was no longer used,
so I got rid of it.
_6. We should have an info statement when the group ids are deleted in the non
dry-run mode._
Good catch :) Change implemented
_7. info("Removal has successfully completed.") can probably be refactored to
something more specific to this tool._
Below is what I changed it to:
logger.info("Kafka obsolete Zk entries cleanup tool shutdown successfully.")
_8. Instead of writing a different info statement for dry-run mode, I think you
should be able to set logIdent of Logging to "[dry-run]" or "" depending on
which mode the tool is working in._
Nice one :) Change implemented
Minor stuff:
_1. I think we tend to use camelCase in variable names instead of underscores._
_2. Whitespaces can be made more consistent._
I just read http://kafka.apache.org/coding-guide.html. Will take care of that
from now on.
One related question: is there a code formatter for Kafka?
A long wiki page of project-specific formatting rules is helpful, but people
won't have it in mind every time they code.
It gets worse when people work on multiple projects that follow different
conventions. Here is what some other projects came up with to account for that:
- https://issues.apache.org/jira/browse/HBASE-3678
- http://svn.apache.org/viewvc/nutch/branches/2.x/eclipse-codeformat.xml?view=markup
- https://github.com/cloudera/blog-eclipse/blob/master/hadoop-format.xml
Attached KAFKA-559.v2.patch: implemented the changes suggested by
[~swapnilghike] and did some sundry refactoring.
> Garbage collect old consumer metadata entries
> ---------------------------------------------
>
> Key: KAFKA-559
> URL: https://issues.apache.org/jira/browse/KAFKA-559
> Project: Kafka
> Issue Type: New Feature
> Reporter: Jay Kreps
> Assignee: Tejas Patil
> Labels: project
> Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch
>
>
> Many use cases involve transient consumers. These consumers create entries
> under their consumer group in zk and maintain offsets there as well. There is
> currently no way to delete these entries. It would be good to have a tool
> that did something like
> bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date]
> --zookeeper [zk_connect]
> This would scan through consumer group entries and delete any that had no
> offset update since the given date.