>>>>> On Mon, 8 Sep 2008 16:36:00 -0700, Eric Wilhelm <[EMAIL PROTECTED]> said:
> # from Andreas J. Koenig
> # on Monday 08 September 2008 15:16:
>> Since yesterday I have downloaded and analysed ~56000 test reports
>> from cpantesters and found ~135 distros that have been tested by
>> both MB 0.2808 and 0.2808_03. There is only one result
>> (Test-Group-0.12) that looks bad, but it turns out to be due to the
>> broken Test::More 0.81_01. All others suggest that _03 is doing
>> well.

> Umm... okay.

> 1. I see a lot of m/0.2808_03 +FAIL/ in there.

OK, I'll walk you through them. First off, there are ten cases in the
file I sent you.

  B-Generate-1.13  0.2808     FAIL  5
  B-Generate-1.13  0.2808     PASS  6
  B-Generate-1.13  0.2808_03  FAIL  1

This is a case where it's impossible to judge without looking at the
report, but at the same time we cannot have any expectations about a
single event when the previous outcome was diverse. Let's call it a
DUNNO.

  CGI-Application-Plugin-ValidateRM-2.2  0.2808     FAIL  18
  CGI-Application-Plugin-ValidateRM-2.2  0.2808_03  FAIL   2

Seems like exactly the right behaviour. Let's call it an OK.

  Devel-LeakTrace-0.05  0.2808     FAIL  43
  Devel-LeakTrace-0.05  0.2808     PASS   6
  Devel-LeakTrace-0.05  0.2808_03  FAIL   1

A DUNNO, but the likelihood is high that we need not look closer at
this one.

  HTTP-Proxy-0.23  0.2808     FAIL  8
  HTTP-Proxy-0.23  0.2808     PASS  5
  HTTP-Proxy-0.23  0.2808_03  FAIL  6
  HTTP-Proxy-0.23  0.2808_03  PASS  1

Although it's a DUNNO, the distribution between fail and pass is quite
good.

  Math-BaseCalc-1.012  0.2808     FAIL  9
  Math-BaseCalc-1.012  0.2808     PASS  9
  Math-BaseCalc-1.012  0.2808_03  FAIL  1

A DUNNO.
  Metaweb-0.05  0.2808     FAIL  14
  Metaweb-0.05  0.2808     PASS  10
  Metaweb-0.05  0.2808_03  FAIL   1

DUNNO.

  Parse-BACKPAN-Packages-0.33  0.2808     FAIL  18
  Parse-BACKPAN-Packages-0.33  0.2808     PASS   8
  Parse-BACKPAN-Packages-0.33  0.2808_03  FAIL   1

DUNNO.

  Template-Plugin-Class-0.13  0.2808     FAIL   6
  Template-Plugin-Class-0.13  0.2808     PASS  55
  Template-Plugin-Class-0.13  0.2808_03  FAIL   1

DUNNO.

  Test-Group-0.12  0.2808     PASS  47
  Test-Group-0.12  0.2808_03  FAIL   1

A WHOA THERE: that seems to indicate that something's wrong. But as I
explained in the previous mail, this is due to a Test-Simple dev
release.

  Test-JSON-0.06  0.2808     FAIL  15
  Test-JSON-0.06  0.2808     PASS  44
  Test-JSON-0.06  0.2808_03  FAIL   1

A DUNNO again.

So to sum up: two of the ten support the view that _03 is doing fine,
one appears to be against it but is proved wrong, and the remaining
seven are simply DUNNOs that we can ignore because they give us no
reason to doubt.

> Did you chase-down several of those?

No.

> Are you saying that having "some fail" on 0.2808 implies that "some
> fail" on 0.2808_03 means no regression, or did you manage to somehow
> correlate the 0.2808_03 fails to the same machines sending 0.2808
> fails?

As explained above, I used judgement. If somebody with strong
statistics fu can measure the trustworthiness of the data in favor of
a release, please speak up.

> 2. Where are these reports coming from?

As I said, I have (well, CPAN::Testers::ParseReport has) downloaded
56000 reports from http://www.nntp.perl.org/group/perl.cpan.testers/

> Again, the subject of false fails: I would hate for testers to be
> pummelling other authors with alpha M::B errors while the M::B
> maintainers are left blissfully ignorant.

<plug> Toolchain maintainers will probably want to use ctgetreports,
which comes with CPAN::Testers::ParseReport. </plug>

If dev releases pummel other authors, that's a call for better tests.
If your tests are good, then release early, release often, and watch
the results on cpantesters.
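For the curious, the rule of thumb I applied above can be sketched
roughly like this (a toy sketch in Python, not what
CPAN::Testers::ParseReport actually does; the function name and the
exact conditions are mine):

```python
# Toy sketch of the judgement used above: given grade counts for the
# stable MB release and for the dev release, label a distro OK, WHOA,
# or DUNNO. Illustrative only; not CPAN::Testers::ParseReport code.
from collections import Counter

def classify(stable: Counter, dev: Counter) -> str:
    """Judge one distro's dev-release results against its history."""
    # Stable always failed and dev fails too: exactly as expected.
    if stable.get("PASS", 0) == 0 and dev.get("PASS", 0) == 0:
        return "OK"
    # Stable always passed but dev fails: something looks wrong.
    if stable.get("FAIL", 0) == 0 and dev.get("FAIL", 0) > 0:
        return "WHOA"
    # Mixed history: one or two dev results tell us nothing.
    return "DUNNO"

# Three of the ten cases from the table above:
cases = {
    "B-Generate-1.13":
        (Counter(FAIL=5, PASS=6), Counter(FAIL=1)),
    "CGI-Application-Plugin-ValidateRM-2.2":
        (Counter(FAIL=18), Counter(FAIL=2)),
    "Test-Group-0.12":
        (Counter(PASS=47), Counter(FAIL=1)),
}

for dist, (stable, dev) in cases.items():
    print(dist, classify(stable, dev))
```

Run on the three cases above this prints DUNNO, OK, and WHOA
respectively, matching the hand judgement.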
That's the point of cpantesters for toolchain modules: they can watch
not only their own test results but all results where they might be
involved.
--
andreas