On 10/21/2017 10:50 PM, Gavin McDonald wrote:
> How about some fine-grained control over which metrics people/projects want
> to collect and display?
> 
> Supposing Kibble collects data on 100 different metrics, how about
> collecting and displaying only a common subset by default, with the rest
> able to be turned on/off at will?

Data is typically collected only once, so the overhead of gathering
"everything" is amortized over time. Likewise, only the data you wish to
have displayed is fetched from the database when creating a visualization;
that's what date ranges, views and sub-filters are for.
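
As a rough illustration of that fetch-side filtering, here is a minimal
sketch assuming an Elasticsearch-style store; the index and field names
("code_commit", "sourceID", "tsday") are made up for illustration, not
Kibble's actual schema:

    from elasticsearch import Elasticsearch

    es = Elasticsearch()

    def fetch_commits(source_id, date_from, date_to):
        # Pull only the slice the dashboard asked for; everything else
        # stays in the store untouched.
        query = {
            "query": {
                "bool": {
                    "must": [
                        {"term": {"sourceID": source_id}},
                        {"range": {"tsday": {"gte": date_from,
                                             "lte": date_to}}},
                    ]
                }
            }
        }
        return es.search(index="code_commit", body=query)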

With most data sources, you have to grab everything before you can even
know who is who and what happened when, so this would pose some logistical
and logical difficulties. To name some examples: for git, you have to
clone the entire repository before you can extract anything at all, and
with email, you have to download the full month's archive to see who
posted what before you can start filtering it.
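
To make that concrete, here is a minimal sketch of the "grab everything
first" problem; the repository URL, mbox path and date cut-off below are
hypothetical:

    import mailbox
    import subprocess

    # Git: the full clone has to come first; filtering the history is
    # only possible afterwards.
    subprocess.run(
        ["git", "clone", "https://example.org/repo.git", "/tmp/repo"],
        check=True,
    )
    recent_authors = subprocess.run(
        ["git", "log", "--since=2017-01-01", "--format=%ae"],
        cwd="/tmp/repo", capture_output=True, text=True,
    ).stdout.splitlines()

    # Email: the whole month's mbox is downloaded before anyone can be
    # counted or filtered.
    posters = [msg["From"] for msg in mailbox.mbox("/tmp/dev_2017-10.mbox")]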

It would be possible for some source analyses to say "only collect new
stuff, don't go back to 1995", with some sort of "minimum age" requirement
built into a source's meta-data, as sketched below.
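
A hypothetical shape for that meta-data might look like this (the "minAge"
key is purely illustrative, not an existing kibble-scanners setting):

    source = {
        "sourceURL": "https://example.org/repo.git",
        "type": "git",
        # Hypothetical cut-off: skip anything older than this epoch time.
        "minAge": 1483228800,  # 2017-01-01 00:00:00 UTC
    }

    def should_scan(item_epoch, source):
        # Honor the minimum-age cut-off, defaulting to "scan everything".
        return item_epoch >= source.get("minAge", 0)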

Might be worth making a ticket in kibble-scanners about this, though I
suspect most projects would want a complete history when they analyze a
source.

With regards,
Daniel.

> 
> Obviously the '100' metrics should be configurable in terms of what data is
> searched for and collected, in addition to what is displayed.
> To expand on the previous sentence: if someone fine-tunes a metric to only
> display the last n posts by people a, b, c from list X, then perhaps make it
> so that only the data asked for is collected (rather than fetching 1000
> posts when only 6 are required for display).
> 
> Gav…
> 
>> On 22 Oct 2017, at 5:54 am, Daniel Gruno <[email protected]> wrote:
>>
>> On 10/21/2017 06:11 PM, Rich Bowen wrote:
>>> I think it's important that we stay neutral about whether a particular
>>> metric is important, useful, necessary, and so on. That is, while it may
>>> be obvious to us that a more corporately-diverse developer pool is a good
>>> thing, that is an opinion based in dogma, not in science. What counts as
>>> a "good" value for a particular metric is a matter of philosophy rather
>>> than science, and so is what constitutes a healthy community.
>>>
>>> The software should be awesome at collecting data and displaying trends.
>>> The specific community in question is responsible for interpreting what
>>> they mean, in the context of their own community, and what they want to do
>>> about it.
>>>
>>> The question of "here's what you should do to make your community more
>>> healthy" is amazingly complicated, and while that may be a goal some day,
>>> it's in a much later version.
>>>
>>> This also implies that we should be asking projects (LOTS of them) what
>>> metrics/trends they wish they had a tool to track, and provide those tools.
>>> We should also be asking them what correlations they want to see and add
>>> those tools. Things like "when I make a release I get more downloads" or
>>> "when I add N new committers my tickets get closed slower/faster" or
>>> whatever. We don't know what they want to know, and if we assume that we
>>> do, we'll be missing an opportunity.
>>
>> Big +1 to this.
>> I think Hervé was more focused on the practicalities here, and both
>> aspects are important. We both want something that works and is
>> tangible and accessible, but we also want the more qualitative and
>> anecdotal information out there, not just cold numbers.
>>
>> I know you have a hectic schedule, but it would be awesome if we could
>> come up with some emails to send to the ASF projects for starters, and
>> gauge what they feel are good metrics to keep an eye on as a community -
>> maybe even get some "behind the scenes" commentary on how certain
>> metrics have correlated to ups/downs of various aspects of the projects.
>> Not so we can say "your project is doing well or badly", but so we can
>> provide information on certain metrics and leave it to projects to act
>> on that, together with information about health/diversity and historical
>> correlations (i.e. form an informed opinion).
>>
>> With regards,
>> Daniel.
>>
>>>
>>>
>>> On Sat, Oct 21, 2017 at 12:57 PM, Hervé BOUTEMY <[email protected]>
>>> wrote:
>>>
>>>> to me, ideally, Kibble 1.0 would be when it has the features required to
>>>> replace Snoot in Apache Projects Statistics [1] (the Snoot service could
>>>> use Kibble 1.0 as its code)
>>>>
>>>> Looking at the "Data Points" page in the Kibble demo [2], it seems we're
>>>> not so far off: release early, release often; adding features not
>>>> available in Snoot for projects.a.o would be for later versions.
>>>>
>>>> Regards,
>>>>
>>>> Hervé
>>>>
>>>> [1] https://projects.apache.org/statistics.html
>>>>
>>>> [2] https://demo.kibble.apache.org/dashboard.html?page=repos
>>>>
>>>> On Friday, 20 October 2017 at 16:17:47 CEST, Daniel Gruno wrote:
>>>>> I'd like to kick off a larger discussion around what we hope Kibble can
>>>>> achieve, and how this will come about.
>>>>>
>>>>> For starters, what sort of data should we collect and display, what
>>>>> types of visualizations should we offer, and are there special formulas
>>>>> or algorithms (like Pony Factor) that we'd like to see. Which internal
>>>>> features should we be using (such as account linking/collating or
>>>>> collapsible groups of repos based on regex, etc.)?
>>>>>
>>>>> Then come the bigger points: at what point would we consider Kibble
>>>>> good enough for a release? What things MUST we have before we can go out
>>>>> and say "hey, we've got this amazing tool, check it out!"?
>>>>>
>>>>> On a similar note, I (and probably Rich too) will be reaching out to the
>>>>> CHAOSS project over at LF ( http://chaoss.community/ ) to see if we can
>>>>> work out some specifications and standards with them, for use in Kibble.
>>>>>
>>>>> With regards,
>>>>> Daniel.
>>>>
>>>>
>>>>
>>>
>>
> 
