> On Oct 2, 2025, at 10:45 AM, sebb <[email protected]> wrote:
>
> On Thu, 2 Oct 2025 at 15:30, Rich Bowen <[email protected]> wrote:
>>
>> Hi, folks,
>>
>> For the last couple of months I’ve been producing these -
>> https://boxofclue.com/apache-highlights/
>>
>> The process is that I have a checkout of every repo under
>> https://github.com/apache (just metadata, not actual files. 2.9Gb total) and
>> I grind through them to generate some metrics like:
>>
>> First time commit (ie, had a commit in a merged PR the first time)
>> 10th/100th/1000th/etc commit
>>
>> There are some false positives, which I think come from, for example, X
>> makes a first-time commit to iceberg-fortran but they’ve contributed to
>> iceberg-rust before. But for the most part, it gives a really great weekly
>> snapshot of who the new people in your project are. I’ve gotten a couple
>> positive comments from a handful of projects that are using this data to
>> welcome new contributors, which was the intent of the thing. (I post it to
>> Mastodon every week.)
>>
>> I’d like to run this on our VM, rather than running it on my laptop every
>> Monday morning. I’d also like to link to the reports from a couple of places
>> on our website (and don’t really want to link to boxofclue.com from there!)
>> But I wanted to run it by you folks first, before taking the liberty to do
>> that. Does anybody have any objections to me doing this?
>
> Which VM did you have in mind?

projects.apache.org seems to make the most sense, since these are project metrics.

Also, a related question: do you think anyone would object to their name being listed on such a document on an Apache website? I cannot personally think why they would (and this is all already-public data), but it is possible that someone might, and I want to be sensitive to that.

--
Rich Bowen
[email protected]
