Re: Measuring water quality of the CPAN river

2015-05-11 Thread Neil Bowers
> I think we could be considering a number of dimensions.  Brainstorming:
> 
> * fails tests now – i.e. blocking automated dep installations – this is your 
> existing metric
> * possibly an OS-specific variation of the above
> * stability over time – possibly your existing metric, but taking last "N" 
> stable releases or all stable releases in last "N" years

Related to this: stability of interface over time. We don’t really have 
metadata that can be used to track this automatically though.
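
The release-stability side we could compute today, at least. A rough sketch, 
assuming a hypothetical fetch_release_summaries() that returns CPAN Testers 
pass/fail counts for each release of a dist, newest first:

    # Average pass rate over the last N stable releases (sketch only).
    # fetch_release_summaries($dist) is a hypothetical helper returning
    # [ { version => '1.23', pass => 120, fail => 4 }, ... ], newest first.
    sub stability_score {
        my ($dist, $n) = @_;
        my @releases = grep { $_->{version} !~ /_/ }    # skip dev releases
                       @{ fetch_release_summaries($dist) };
        splice @releases, $n if @releases > $n;
        my ($pass, $total) = (0, 0);
        for my $r (@releases) {
            $pass  += $r->{pass};
            $total += $r->{pass} + $r->{fail};
        }
        return $total ? $pass / $total : undef;
    }

Averaging over several releases, rather than just looking at the latest one, 
stops a single bad (or good) release from dominating the picture.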

> * something looking at bug queues – possibly open tickets proportional to 
> number of downriver dependents; possibly looking at open vs close ratios

One of the (rare) reasons I like RT over GitHub is that bugs can be classified 
in a standard way. The downside is that not many people do. If we could be sure 
that an issue were a bug, and not a wishlist item, a typo in the docs, etc., then 
this would be a good thing to factor in.
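
Even with noisy classification, the raw ratios are cheap to compute. Something 
of this shape is what I'd look at – the open/closed counts and the dependent 
count are assumed to come from elsewhere (RT or GitHub, and the river data):

    # Sketch: open tickets relative to downriver dependents, plus an
    # open-vs-closed ratio. Inputs are assumed, not fetched here.
    sub bug_pressure {
        my (%arg) = @_;    # e.g. open => 12, closed => 90, dependents => 340
        my $per_dependent = $arg{dependents}
            ? $arg{open} / $arg{dependents}
            : $arg{open};
        my $open_ratio = ($arg{open} + $arg{closed})
            ? $arg{open} / ($arg{open} + $arg{closed})
            : 0;
        return { per_dependent => $per_dependent, open_ratio => $open_ratio };
    }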

Prompted by Kent’s response: we could also factor in test coverage. It would be 
great if cpancover.com evolved into a CPAN Testers-type service, so we could 
have coverage smokers giving us coverage data on the mainstream operating 
systems and “supported” versions of Perl.
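
If we did get that, coverage would just be one more weighted term in whatever 
composite score we end up with, e.g. (weights entirely made up):

    # Toy composite: all inputs in 0..1, weights are illustrative only.
    sub composite_score {
        my (%m) = @_;    # pass_rate, stability, coverage
        return 0.5 * $m{pass_rate}
             + 0.3 * $m{stability}
             + 0.2 * $m{coverage};
    }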

A couple of your points could be combined: only require a developer release if 
(enough) code was changed to warrant it.

Neil



Re: Measuring water quality of the CPAN river

2015-05-11 Thread Kent Fredric
On 11 May 2015 at 19:20, Neil Bowers wrote:

> look at 2 or more CPAN Testers fails where the only difference is an
> upriver version number.


My point didn't pertain to upriver versions changing, but to the observation
that upriver modules can have buggy code that is only broken on certain
architectures, and have no tests to expose that fact.

Thus, anything consuming that module, regardless of the version it's at, will
have failure rates on CPAN for that architecture that the author of the
downstream module didn't anticipate.

But the problem is the upriver module, and it's not a symptom exposed by
upriver *changes*, but a fundamental issue: upriver was *always*
broken.

The most common case of this I've seen is when upriver is only coded so it
works on a specific NV size, and on different NV sizes the behaviour is
unpredictable.

Authors of downstream modules who have the magical NV size don't notice the
problem and ship.

And soon after, they get failures pertaining to NV size, *if* they had
tests.
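
A contrived illustration of the kind of test I mean – invented for this mail,
not lifted from any particular dist:

    use Config;
    use Test::More;

    # 1/3 is not exactly representable; round-tripping it through a 64-bit
    # double (pack 'd') only preserves it exactly when the native NV *is*
    # a 64-bit double, i.e. $Config{nvsize} == 8.
    my $nv        = 1 / 3;
    my $roundtrip = unpack 'd', pack 'd', $nv;

    is $roundtrip, $nv, "NV survives a pack('d') round trip";

    diag "nvsize on this perl: $Config{nvsize}";
    done_testing;

On an nvsize == 8 perl that passes every time, so the author never sees a
problem; on a long-double or quadmath build it can fail, and only smokers
with that configuration will tell you.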

This can go on and on: you can get things 3+ deps away exposing an error
that is truly in an upstream module, simply because the intermediate modules
don't expose it themselves, due to their absence of tests.

Obviously the accuracy of any such metric gets weaker the further it gets
from the real point of origin. And even at D=2, it's already rather vague.



-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: Measuring water quality of the CPAN river

2015-05-11 Thread Neil Bowers
> On 11 May 2015, at 01:47, Kent Fredric wrote:
> So the quality of a dist could be measured indirectly by the failure rate of 
> its dependents.

This was kind of the basis of the “River Smoker” idea that Tux and I 
discussed late on the last day of the QAH:

http://neilb.org/2015/04/24/cpan-downstream-testing.html
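
The calculation itself is simple enough – something of this shape, with
direct_dependents() and test_summary() as hypothetical helpers backed by the
river data and CPAN Testers:

    # Judge a dist indirectly: aggregate the failure rates of everything
    # directly downriver of it. Both helpers below are hypothetical.
    sub downstream_failure_rate {
        my ($dist) = @_;
        my ($fail, $total) = (0, 0);
        for my $dep ( direct_dependents($dist) ) {
            my $s = test_summary($dep);    # { pass => ..., fail => ... }
            $fail  += $s->{fail};
            $total += $s->{pass} + $s->{fail};
        }
        return $total ? $fail / $total : undef;
    }

The hard part is the attribution you describe below, not the arithmetic.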

> Or as an analogy, we have 2 sampling points in the river 100m apart.
> 
> If we sample at point 1, we can't sense the fecal matter because it enters 
> the river downstream of the sampling point.
> 
> But the fecal matter is sampled at point 2, so by conjecture, either point 2 
> created it, or it entered between points 1 and 2.
> 
> Sampling across the river at point 2 helps infer where the probable fecal 
> matter is.

Sort of a CPAN bisect, or rather a CPAN Testers bisect: look at 2 or more CPAN 
Testers fails where the only difference is an upriver version number.
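
Roughly: take two reports for the same release on the same platform and perl –
say a PASS and a FAIL, or two FAILs straddling an upriver release – and diff
the prerequisite versions recorded in them. The report structure here is
invented for the sketch:

    # Sketch of a "CPAN Testers bisect": list the prereqs whose installed
    # versions differ between two reports. Each report is assumed to carry
    # a prereqs => { 'Some::Module' => '1.23', ... } map (invented here).
    sub differing_prereqs {
        my ($report_a, $report_b) = @_;
        my %a = %{ $report_a->{prereqs} };
        my %b = %{ $report_b->{prereqs} };
        my @suspects;
        for my $module ( sort keys %b ) {
            no warnings 'uninitialized';
            push @suspects, $module
                if $a{$module} ne $b{$module};
        }
        return @suspects;    # a single suspect is the prime candidate
    }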

Neil