# from Andreas J. Koenig
# on Tuesday 09 September 2008 00:09:

>OK, I'll walk you through them. First off, there are ten cases in the
>file I sent you.
>...
>  So the above is a case where it's impossible to judge without
>  looking at the report but at the same time we cannot have any
>  expectations about a single event when the previous outcome was
>  diverse. Let's call it a case DUNNO.
>
>CGI-Application-Plugin-ValidateRM-2.2  0.2808      FAIL  18
>CGI-Application-Plugin-ValidateRM-2.2  0.2808_03   FAIL  2 
>
>  Seems like exactly the right behaviour. Let's call it a case OK.
> ...
>So to sum up, we have found that two of the ten support the view that
>_03 is doing fine, one appears to be against it but is proved wrong,
>and the seven remaining are simply DUNNOs that we can ignore because
>they give us no reason to doubt.
>
>  > Did you chase-down several of those?
>
>No.

The judgement makes sense assuming an even scatter of machine profiles 
and so on.  If it is not actually possible to correlate 0.2808_03 
results to 0.2808 results, then I suppose judgement is all we can get.  
(Being the sort that I am, I would feel a lot better with some form of 
1:1 comparisons.)
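
To make the 1:1 idea concrete, here is a minimal sketch of what I mean.  
The input format is purely hypothetical -- one report per line as "dist 
mb-version grade tester" -- since I do not know off-hand what 
CPAN::Testers::ParseReport actually emits; the point is only the 
pairing by tester:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # One grade per (dist, tester) pair, split by which M::B was installed.
    my (%with_stable, %with_dev);
    while (<>) {
        my ($dist, $mb_version, $grade, $tester) = split ' ';
        next unless defined $tester;
        my $key = "$dist\0$tester";
        if    ($mb_version eq '0.2808')    { $with_stable{$key} = $grade }
        elsif ($mb_version eq '0.2808_03') { $with_dev{$key}    = $grade }
    }

    # Only a tester who reported the same dist under both M::B versions
    # gives a true 1:1 data point.
    for my $key (sort keys %with_stable) {
        next unless exists $with_dev{$key};
        my ($dist, $tester) = split /\0/, $key;
        my $verdict = $with_stable{$key} eq $with_dev{$key}
            ? 'SAME' : 'CHANGED';
        printf "%-40s %-20s %s -> %s  %s\n",
            $dist, $tester, $with_stable{$key}, $with_dev{$key}, $verdict;
    }

Any CHANGED line would be something to chase down directly instead of 
judging from aggregate counts.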

>If somebody with strong statistics fu can measure the trustworthiness
>of the data in favor of a release, please speak up.

As something to consider for cpantesters 2.0, the ability to analyze 
this sort of thing without statistics would be very useful.

>  > 2.  Where are these reports coming from?
>
>I have said it, I have (well, CPAN::Testers::ParseReport has)
>downloaded 56000 reports from
>http://www.nntp.perl.org/group/perl.cpan.testers/

No, I meant which *testers*.  How are the alpha versions getting 
installed?  Is it manually, via some option in the automated smoke 
tools, or what?  I have been under the impression that alpha 
dependencies never get installed automatically.

>If dev releases pummel other authors, it's a call for better tests. If
>your tests are good, then release early, release often, and watch the
>results on cpantesters. The point of cpantesters for toolchain modules
>is that they can watch not only their own test results but all results
>where they might be involved.

How does this process work?  If I release an alpha of M::B with a bug, 
how long before that irritates, distracts, and confuses a bunch of 
other authors?  Meanwhile, I have to watch test results for 2000+ other 
dists to find it?

What triggers the testing of dists that use M::B, though?  Is it only 
a newly uploaded dist?

Yes, Module::Build needs better tests.  It also needs somebody with the 
time to write them.  (If Devel::Cover worked, I imagine it would tell 
me that the coverage is rather low.)

If I had the time, before a release I would run the M::B tests on 
multiple platforms and perl versions, then, for each of those, run 
through a group of installs of dists known to use Module::Build, with 
known results from a previous run.  Those results could be pass or 
fail -- the metric is whether the same dist does exactly the same thing 
(e.g. builds OK and fails test X) with both versions of M::B.  That is 
what I would consider a controlled test with quantifiable results.  
Granted, you cannot prove a negative, but the scientific method would 
give me a lot more confidence than "probably is okay".
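
A rough sketch of the driver I am imagining -- everything in it is an 
assumption on my part: it supposes the stable and dev M::B are 
installed under lib/stable and lib/dev, and that the dists to check are 
already unpacked under dists/:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Cwd qw(getcwd);

    # Hypothetical layout: two Module::Build installs to compare, and a
    # directory of unpacked dists known to use Module::Build.
    my %mb_lib = (
        stable => getcwd() . '/lib/stable',
        dev    => getcwd() . '/lib/dev',
    );
    my @dists = glob 'dists/*';

    sub outcome {
        my ($dir, $lib) = @_;
        local $ENV{PERL5LIB} = $lib;
        # Start from a clean slate (realclean is a standard M::B action).
        system("(cd $dir && ./Build realclean) >/dev/null 2>&1")
            if -e "$dir/Build";
        # Only the exit status is recorded; a finer metric would diff the
        # actual test output to see *which* test changed.
        system("(cd $dir && perl Build.PL && ./Build && ./Build test)"
             . " >/dev/null 2>&1");
        return $? == 0 ? 'pass' : 'fail';
    }

    for my $dir (@dists) {
        my $stable = outcome($dir, $mb_lib{stable});
        my $dev    = outcome($dir, $mb_lib{dev});
        printf "%-40s stable=%s dev=%s  %s\n",
            $dir, $stable, $dev, $stable eq $dev ? 'SAME' : 'DIFFERENT';
    }

Anything marked DIFFERENT there would be worth a close look before the 
release goes out.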

Thanks,
Eric
-- 
I arise in the morning torn between a desire to improve the world and a
desire to enjoy the world. This makes it hard to plan the day.
--E.B. White
---------------------------------------------------
    http://scratchcomputing.com
---------------------------------------------------
