I'm working on a script to track contributors so that (A) we can track
project health for ASF board report purposes and (B) we can possibly
share a nice "Thank you" listing contributors in release
announcements. Other purposes might crop up. GitHub's contributors
report has serious shortcomings[1] so I'm not using that.
So far I have something like this:
    git log main --since="3 months ago" --pretty="Author: %an <%ae>%n%B" \
      | awk -F': ' '/^(Author|Co-authored-by): / {print $2}' | sort | uniq -c
But it needs deduplication because most people have multiple entries.
Given the complexity of deduplication, I'd convert this to Python, put
it in dev-tools/scripts, and create a "contributors.txt" file
somewhere that contains a full name, primary email, and email aliases.
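
A rough sketch of what the Python version could look like, not a finished
script. The contributors.txt line format used here (full name, primary email
in angle brackets, then whitespace-separated aliases) is only an assumption
for illustration, as are the function names. Emails not found in the file
fall through unchanged so they are easy to spot and add.

```python
import re
import subprocess
from collections import Counter

# Matches the "Author: Name <email>" lines emitted by the --pretty format,
# plus "Co-authored-by: Name <email>" trailers in commit message bodies.
CONTRIB_RE = re.compile(r"^(?:Author|Co-authored-by): (.*?)\s*<([^>]*)>\s*$")


def parse_aliases(text):
    """Map each known email (lowercased) to a canonical 'Name <primary>' label.

    Assumed (hypothetical) line format:
        Full Name <primary@example.org> alias1@example.org alias2@example.org
    Blank lines and lines starting with '#' are ignored.
    """
    canonical = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = re.match(r"(.*?)\s*<([^>]+)>\s*(.*)$", line)
        if not m:
            continue
        name, primary, rest = m.groups()
        label = f"{name} <{primary}>"
        for email in [primary, *rest.split()]:
            canonical[email.lower()] = label
    return canonical


def count_contributors(log_text, canonical):
    """Count Author/Co-authored-by lines, merging known aliases by email."""
    counts = Counter()
    for line in log_text.splitlines():
        m = CONTRIB_RE.match(line)
        if not m:
            continue
        name, email = m.groups()
        # Unknown emails keep their own "Name <email>" entry.
        counts[canonical.get(email.lower(), f"{name} <{email}>")] += 1
    return counts


def git_log(rev="main", since="3 months ago"):
    """Run the same git log invocation as the shell one-liner."""
    return subprocess.run(
        ["git", "log", rev, f"--since={since}",
         "--pretty=Author: %an <%ae>%n%B"],
        check=True, capture_output=True, text=True,
    ).stdout


def main(alias_path="contributors.txt"):
    """Print a 'count name <email>' listing, most active first."""
    with open(alias_path, encoding="utf-8") as f:
        canonical = parse_aliases(f.read())
    for who, n in count_contributors(git_log(), canonical).most_common():
        print(f"{n:7d} {who}")
```

Calling main() from the repository root would print a listing similar to
the "uniq -c" output, with aliases folded into one entry per person.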
I'm sure it's debatable to go this route vs CHANGES.txt, but the latter
is harder to parse and ... I dunno; I don't like that it's so custom
compared to a generic Git metadata approach. On the other hand, with
CHANGES.txt the dedupe might not be necessary (just fix CHANGES.txt for
dups), and it wouldn't include trivial edits (for better/worse).
CHANGES.txt would also be more accurate for version-specific
contribution attribution (since CHANGES.txt is organized this way), but
it is harder to use between arbitrary commits/dates.
[1]
https://docs.github.com/en/repositories/viewing-activity-and-data-for-your-repository/viewing-a-projects-contributors#troubleshooting-contributors
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley