Hi All,
I am hoping to get more involved in the upstream Kafka community. To that end,
I was trying to keep up with the KIPs that were currently under discussion.
However, I found it hard to keep track of what was and wasn't being discussed
and the progress they were making. Some KIPs appeared abandoned but were still
classed as "Under Discussion".
So, during a very rainy week on holiday, I created a tool (which I called
KIPper [[1](https://github.com/tomncooper/kipper)]) to parse the dev mailing
list archive and extract all KIP mentions. I paired this with information
parsed from the Confluence (wiki) API to create an enriched table of the KIPs
Under Discussion [[2](https://tomncooper.github.io/kipper/)].
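As a rough illustration of the mention-extraction step, here is a minimal sketch of pulling KIP numbers out of an email subject line. The pattern and the function name `extract_kip_ids` are assumptions made for this example and are not taken from the KIPper source, which may do this differently.

```python
from __future__ import annotations

import re

# Hypothetical pattern: matches "KIP-123", "KIP 123" or "KIP123", ignoring case.
KIP_PATTERN = re.compile(r"\bKIP[-\s]?(\d+)\b", re.IGNORECASE)


def extract_kip_ids(subject: str) -> list[int]:
    """Return the distinct KIP numbers mentioned in a subject line, in order."""
    seen = []
    for match in KIP_PATTERN.finditer(subject):
        kip_id = int(match.group(1))
        if kip_id not in seen:
            seen.append(kip_id)
    return seen


if __name__ == "__main__":
    print(extract_kip_ids("Re: [DISCUSS] KIP-123: An example proposal"))
```

A subject with no KIP reference simply yields an empty list, so such emails can be skipped.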
The table shows a "Status" for each KIP, which is based on the last time the
KIP was mentioned in the subject line of an email on the dev mailing list.
Green means within the last month, yellow within the last 3 months and red
within the last year. If the status is black then it hasn't been mentioned in
over a year.
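The colour rule above could be sketched roughly as follows. The exact cut-offs (30, 90 and 365 days) and the function name `kip_status` are assumptions for illustration; the real tool may define "month" and "year" differently.

```python
from datetime import date, timedelta


def kip_status(last_mention: date, today: date) -> str:
    """Map the age of the last subject-line mention to a status colour."""
    age = today - last_mention
    # Assumed thresholds: 30 days ~ a month, 90 days ~ 3 months, 365 days ~ a year.
    if age <= timedelta(days=30):
        return "green"
    if age <= timedelta(days=90):
        return "yellow"
    if age <= timedelta(days=365):
        return "red"
    return "black"


if __name__ == "__main__":
    print(kip_status(date(2023, 11, 1), today=date(2024, 1, 1)))
```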
I also added vote information, but this is only indicative as it is based on
parsing the non-reply lines (those without ">" in them) of the email bodies, so
it could include false positives.
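To show why this is only indicative, here is a minimal sketch of that kind of vote parsing. The regular expression and the name `extract_votes` are assumptions for this example, not KIPper's actual implementation.

```python
import re

# Hypothetical pattern: a standalone "+1", "-1", "+0" or "-0" token.
VOTE_PATTERN = re.compile(r"(?<!\S)([+-][01])(?!\d)")


def extract_votes(body: str) -> list:
    """Collect vote-like tokens from lines that contain no ">" character."""
    votes = []
    for line in body.splitlines():
        if ">" in line:
            # Treat any line containing ">" as quoted reply text and skip it.
            continue
        votes.extend(VOTE_PATTERN.findall(line))
    return votes


if __name__ == "__main__":
    print(extract_votes("+1 (binding)\n> -1 from me\nThanks for the KIP!"))
```

A sentence such as "set x to y +1" would also be counted as a vote by this sketch, which is exactly the kind of false positive mentioned above.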
In the spirit of the discussion on closing stale PRs
[[3](https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt)], it
might be a good idea to introduce a new KIP "state" alongside "Under
Discussion", "Accepted" and "Rejected" (and their numerous variants
[[4](https://github.com/tomncooper/kipper/blob/0bbb5595e79a9e075b0d2dc907c84693734d7846/kipper/wiki.py#L54)]).
Maybe a KIP with a black status and no votes could be moved to a "Stale" or
"Rejected" state?
The kipper page is statically generated at the moment so could be updated every
day with a cron job. The data used to create the page could also be used to
drive automation, perhaps emailing the KIP's author once a KIP hits "Red" status
and then automatically setting the state to stale once it turns "Black"?
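A minimal sketch of how such an automation rule might be expressed is below. The function name `planned_action`, the action labels and the "no votes" condition for marking a KIP stale are assumptions combining the suggestions in this email; nothing like this exists in the tool yet.

```python
def planned_action(status: str, vote_count: int) -> str:
    """Decide what a (hypothetical) daily automation job would do for one KIP."""
    if status == "red":
        # Not mentioned for over 3 months: nudge the author.
        return "email_author"
    if status == "black" and vote_count == 0:
        # Not mentioned for over a year and never voted on: propose "Stale".
        return "mark_stale"
    return "none"


if __name__ == "__main__":
    for status, votes in [("green", 0), ("red", 2), ("black", 0), ("black", 3)]:
        print(status, votes, planned_action(status, votes))
```

Keeping the decision in a pure function like this would let a cron job log the planned actions as a dry run before anything is emailed or changed on the wiki.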
Anyway, I learned a lot making the tool and I now feel I have a better handle
on the state of various KIPs. I hope others find it useful. There is loads of
information to be harvested from the mailing list and wiki APIs so if anyone
has any feature requests please post issues on the GH page. I had one
suggestion of performing sentiment analysis on the email bodies related to each
KIP, to get a feel of how the KIP was being received. But maybe that is a step
too far...
Cheers,
[1] https://github.com/tomncooper/kipper
[2] https://tomncooper.github.io/kipper/
[3] https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt
[4] https://github.com/tomncooper/kipper/blob/0bbb5595e79a9e075b0d2dc907c84693734d7846/kipper/wiki.py#L54
Tom Cooper
[@tomncooper](https://twitter.com/tomncooper) | https://tomcooper.dev