On 10/27/2016 12:20 AM, sebb wrote:
> These explanations of what the stats mean need to be provided on
> the page or linked from it.

Right, perhaps below/above each of them would be a good idea. I'll get
working on that tomorrow.

>
> On 26 October 2016 at 22:12, Daniel Gruno <humbed...@apache.org> wrote:
>> On 10/26/2016 10:56 PM, Phil Steitz wrote:
>>> On 10/26/16 11:07 AM, Daniel Gruno wrote:
>>>> I added an initial stats page at
>>>> https://projects.apache.org/statistics.html - assuming no one objects,
>>>> I'll add it to the top menu of the other pages in a day or so.
>>>>
>>>> Do peruse - anything we need to add/edit?
>>>
>>> Maven is not a programming language. What exactly is the
>>> denominator on that stat? Number of files? Lines of code?
>>> Projects primarily using?
>>
>> I suspect it's scripts specifically for Maven it's counting. The
>> denominator is lines of functional code (101 million in total, not
>> counting blanks and comments which take us to 150M total).
>>
>>>
>>> What does lines changed mean? It looks like lines changed is
>>> somehow supposed to be insertions plus deletions. Where are the
>>> mods to lines? Is this just counting -- and ++ out of diffs? That
>>> is a very bad metric on how much code has actually changed or what a
>>> contribution is. Formatting nits, creating RCs, etc. generate huge
>>> amounts of this stuff without really contributing much.
>>
>> AIUI, the huge ++/-- are weeded out in these charts, otherwise it would
>> be in the millions of lines of code changed some days. We have, on
>> average, 700-800 commits per business day to our repos, and with roughly
>> 100k additions according to the chart, that would indicate an average of
>> ~125 lines changed per commit. It's very possible that this includes
>> some automatic changes, I can't say. As they are somewhat static, I am
>> considering just scrapping that part, it probably doesn't show that much
>> of value to us.
>>
>>>
>>> What in the heck is an "author?" We eliminated @author tags years
>>> ago because *we don't think like that* - let's not regress. If it
>>> means someone created a new file, what is different about that than
>>> just committing a patch of some kind? I would drop that metric or
>>> just merge it into committers.
>>
>> An author in this context is someone who authored a piece of code, a
>> committer is someone who committed the code to a repository. They need
>> not be the same person. In Subversion, they are the same, as svn does
>> not distinguish. In git, they are two different entities. Committers are
>> always ASF committers, authors can be any contributor to a project with
>> or without an apache account.
>>
>>>
>>> I very much do not like the "leader board" concept, especially with
>>> a bogus metric like number of diff lines generated driving it. I
>>> would drop that thing.
>>
>> It's number of unique commits driving it, not number of diffs - that's a
>> secondary statistic. While we disagree on liking this, I'll definitely
>> take it under advisement as I work on the page. Note, it's not been made
>> public in the sense that the front page links to it just yet, I'll do
>> that once we are more aligned idea-wise.
>>
>>>
>>> I would rather see "busiest" or "most active" projects defined by
>>> something more meaningful like number of issues resolved or number
>>> of releases. So change at least the first metric on the bottom to
>>> number of issues resolved and maybe make the second one number of
>>> releases.
>>
>> Number of releases would be nigh impossible, as we don't really keep
>> score of that, at all. Issues solved could be done easily, though we
>> don't have any formal mapping from issue tracker names back to our
>> projects, so it would probably show which JIRA/BZ instances are the most
>> active instead.
>>
>> With regards,
>> Daniel.
>>
>>>
>>> Phil
>>>
>>>>
>>>> With regards,
>>>> Daniel.
>>>>
>>>> On 10/26/2016 01:07 PM, Daniel Gruno wrote:
>>>>> Hi folks,
>>>>> I was wondering, since we have full access to Snoot for the ASF, why not
>>>>> take advantage of that and add a statistics page to projects.apache.org,
>>>>> showing the various live stats available (no. of commits/committers,
>>>>> largest repos by size/commits, proper language breakdown, relationship
>>>>> mapping, mail stats etc).
>>>>>
>>>>> I was inclined to JFDI, but I'd love to hear what others think about
>>>>> this. If I don't hear any loud objections, I'll add a stats page today,
>>>>> and we can see if it's of any use :)
>>>>>
>>>>> Comments? Suggestions? :)
>>>>>
>>>>> With regards,
>>>>> Daniel.
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
>>>>> For additional commands, e-mail: dev-h...@community.apache.org
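
For readers following the author/committer point in the thread, here is a minimal sketch of how the two can be told apart for a git repository. It assumes a local git checkout and relies on git's pretty-format placeholders `%an` (author name), `%cn` (committer name) and `%x09` (a tab byte). The function names and the sample data are hypothetical illustrations, and this is not necessarily how the statistics page computes its numbers.

```python
import subprocess
from collections import Counter


def parse_author_committer(log_text):
    """Parse lines of 'author<TAB>committer' into two Counters of commit counts."""
    authors, committers = Counter(), Counter()
    for line in log_text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        author, committer = line.split("\t", 1)
        authors[author] += 1
        committers[committer] += 1
    return authors, committers


def read_git_log(repo_path="."):
    """Return one 'author<TAB>committer' line per commit of a local checkout."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%an%x09%cn"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample: two patches written by alice and bob were committed
# by carol, and alice committed one change herself.
SAMPLE_LOG = "alice\tcarol\nbob\tcarol\nalice\talice\n"

if __name__ == "__main__":
    authors, committers = parse_author_committer(SAMPLE_LOG)
    print("authors:", dict(authors))
    print("committers:", dict(committers))
```

To try it against a real checkout, pass `read_git_log("/path/to/repo")` to `parse_author_committer` instead of the sample. In a repository mirrored from Subversion the two counts would be expected to match, since svn does not record the distinction.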