On 10/10/17 23:25, Joseph Myers wrote:
> On Tue, 10 Oct 2017, Paulo Matos wrote:
> 
>>     new test -> FAIL        ; New test starts as fail
> 
> No, that's not a regression, but you might want to treat it as one (in the 
> sense that it's a regression at the higher level of "testsuite run should 
> have no unexpected failures", even if the test in question would have 
> failed all along if added earlier and so the underlying compiler bug, if 
> any, is not a regression).  It should have human attention to classify it 
> and either fix the test or XFAIL it (with issue filed in Bugzilla if a 
> bug), but it's not a regression.  (Exception: where a test failing results 
> in its name changing, e.g. through adding "(internal compiler error)".)
> 

When someone adds a new test to the testsuite, isn't it supposed not to
FAIL? And if it does FAIL, shouldn't that be considered a regression?

Now, the danger is that, since regressions are computed by comparison
with the previous run, something like this would happen:

run1:
...
FAIL: foo.c ; new test
...

run1 fails because a new test entered as a FAIL

run2:
...
FAIL: foo.c
...

run2 succeeds because there are no changes.

For this reason, all of these issues need to be taken care of straight
away or they become part of the 'normal' status and no further failures
are reported... unless, of course, a more complex regression analysis is
implemented.

Also, when I say run1 fails or succeeds, that is just the term I use for
displaying red/green in the buildbot interface for a given build, not
necessarily what I expect the process to do.
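
To make that failure mode concrete, here is a minimal sketch of the
naive previous-run comparison described above (Python, with a toy .sum
parser; the helper names are mine, not the buildbot's). Note how foo.c
is flagged only on run1 and silently becomes part of the baseline on
run2:

    # Minimal sketch of a naive previous-run comparison.  The .sum
    # parsing here is deliberately simplistic (illustration only).

    def parse_sum(text):
        """Map test name -> result (PASS/FAIL/...) from .sum contents."""
        results = {}
        for line in text.splitlines():
            for status in ("PASS", "FAIL", "XFAIL", "XPASS", "UNSUPPORTED"):
                if line.startswith(status + ": "):
                    results[line[len(status) + 2:]] = status
        return results

    def new_fails(prev, curr):
        """Naive rule: flag tests that appear as FAIL but were not
        FAILing in the previous run."""
        return [name for name, status in curr.items()
                if status == "FAIL" and prev.get(name) != "FAIL"]

    run0 = parse_sum("PASS: bar.c\n")
    run1 = parse_sum("PASS: bar.c\nFAIL: foo.c\n")  # new test enters as FAIL
    run2 = parse_sum("PASS: bar.c\nFAIL: foo.c\n")  # no change

    print(new_fails(run0, run1))  # ['foo.c'] -> run1 goes red
    print(new_fails(run1, run2))  # []        -> foo.c is now 'normal'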

> 
> My suggestion is:
> 
> PASS -> FAIL is an unambiguous regression.
> 
> Anything else -> FAIL and new FAILing tests aren't regressions at the 
> individual test level, but may be treated as such at the whole testsuite 
> level.
> 
> Any transition where the destination result is not FAIL is not a 
> regression.
> 
> ERRORs in the .sum or .log files should be watched out for as well, 
> however, as sometimes they may indicate broken Tcl syntax in the 
> testsuite, which may cause many tests not to be run.
> 
> Note that the test names that come after PASS:, FAIL: etc. aren't unique 
> between different .sum files, so you need to associate tests with a tuple 
> (.sum file, test name) (and even then, sometimes multiple tests in a .sum 
> file have the same name, but that's a testsuite bug).  If you're using 
> --target_board options that run tests for more than one multilib in the 
> same testsuite run, add the multilib to that tuple as well.
> 

Thanks for all the comments. Sounds sensible.
By not being unique, do you mean between languages?
I assume that two gcc.sum files from different builds will always refer
to the same test/configuration when referring to (for example):
PASS: gcc.c-torture/compile/20000105-1.c   -O1  (test for excess errors)

In this case, I assume that "gcc.c-torture/compile/20000105-1.c   -O1
(test for excess errors)" will always refer to the same thing.
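
For what it's worth, here is a minimal sketch (again Python; the
function names and the example multilib are mine) of the classification
you suggest, keyed on the (.sum file, multilib, test name) tuple so that
same-named tests in different .sum files or multilibs are never
conflated:

    # Sketch of the suggested classification.  Keys are
    # (sum_file, multilib, test_name) tuples; names are illustrative.

    def classify(prev, curr):
        """Split current FAILs into per-test regressions (PASS -> FAIL)
        and testsuite-level problems (anything else -> FAIL, or a new
        FAILing test).  Transitions to a non-FAIL result are ignored."""
        regressions, other_fails = [], []
        for key, status in curr.items():
            if status != "FAIL":
                continue  # destination result is not FAIL: not a regression
            if prev.get(key) == "PASS":
                regressions.append(key)   # unambiguous regression
            elif prev.get(key) != "FAIL":
                other_fails.append(key)   # flag at whole-testsuite level
        return regressions, other_fails

    key = ("gcc.sum", "-m64",
           "gcc.c-torture/compile/20000105-1.c   -O1  (test for excess errors)")
    prev = {key: "PASS"}
    curr = {key: "FAIL"}
    print(classify(prev, curr))  # ([key], []) -> PASS -> FAIL regression

A separate scan for ERROR lines in the .sum/.log files would still be
needed on top of this, as you note, since those do not show up as
per-test results at all.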

-- 
Paulo Matos
