On 11 October 2017 at 11:03, Paulo Matos <pmatos@linki.tools> wrote:
>
>
> On 11/10/17 10:35, Christophe Lyon wrote:
>>
>> FWIW, we consider regressions:
>> * any->FAIL because we don't want such a regression at the whole
>>   testsuite level
>> * any->UNRESOLVED for the same reason
>> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}->XPASS
>> * new XPASS
>> * XFAIL disappears (may mean that a testcase was removed, worth a
>>   manual check)
>> * ERRORS
>>
>
> That's certainly stricter than what was proposed by Joseph. I will
> run a few tests on historical data to see what I get using both
> approaches.
>
>>
>>>> ERRORs in the .sum or .log files should be watched out for as well,
>>>> however, as sometimes they may indicate broken Tcl syntax in the
>>>> testsuite, which may cause many tests not to be run.
>>>>
>>>> Note that the test names that come after PASS:, FAIL: etc. aren't
>>>> unique between different .sum files, so you need to associate tests
>>>> with a tuple (.sum file, test name) (and even then, sometimes
>>>> multiple tests in a .sum file have the same name, but that's a
>>>> testsuite bug). If you're using --target_board options that run
>>>> tests for more than one multilib in the same testsuite run, add the
>>>> multilib to that tuple as well.
>>>>
>>>
>>> Thanks for all the comments. Sounds sensible.
>>> By not being unique, you mean between languages?
>>
>> Yes, but not only, as Joseph mentioned above.
>>
>> You have the obvious example of c-c++-common/*san tests, which are
>> common to gcc and g++.
>>
>>> I assume that two gcc.sum files from different builds will always
>>> refer to the same test/configuration when referring to (for example):
>>> PASS: gcc.c-torture/compile/20000105-1.c -O1 (test for excess errors)
>>>
>>> In this case, I assume that "gcc.c-torture/compile/20000105-1.c -O1
>>> (test for excess errors)" will always be referring to the same thing.
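[Editorial note: the regression rules Christophe lists above can be sketched as a small classifier. This is a hypothetical illustration, not code from the Linaro scripts; the function name `is_regression` and the use of `None` for a missing test are assumptions, and treating an unchanged FAIL/UNRESOLVED as already-known (rather than a new regression) is one possible reading of "any->FAIL".]

```python
# Sketch of the regression rules above; names are illustrative,
# not taken from the Linaro comparison scripts.

REGRESSION_STATES = {"FAIL", "UNRESOLVED"}      # any -> FAIL / UNRESOLVED
XPASS_FROM = {None, "PASS", "UNSUPPORTED",      # None models a new test,
              "UNTESTED", "UNRESOLVED"}         # covering "new XPASS"

def is_regression(old, new):
    """Classify one (old status, new status) transition.

    old is None for a test absent from the old run; new is None for a
    test that disappeared.  An unchanged FAIL/UNRESOLVED is treated as
    already known, not a new regression (one reading of "any->FAIL").
    """
    if new in REGRESSION_STATES and old != new:
        return True
    if new == "XPASS" and old in XPASS_FROM:
        return True
    if old == "XFAIL" and new is None:
        # XFAIL disappeared: may mean the testcase was removed;
        # worth a manual check.
        return True
    return False
```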
>>>
>>
>> In gcc.sum, I can see 4 occurrences of
>> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>>
>> Actually, there are quite a few others like that....
>>
>
> That actually surprised me.
>
> I also see:
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>
> among others like it. Looks like a line number is missing?
>
> In any case, it feels like the code I have for tracking this down
> needs to be improved.
>

We had to derive our scripts from the ones in contrib/ because those
failed to handle some cases (e.g. when the same test reports both PASS
and FAIL; yes, it does happen).
You can have a look at
https://git.linaro.org/toolchain/gcc-compare-results.git/
where compare_tests is a patched version of the contrib/ script; it
calls the main perl script (which is not the prettiest thing :-)

Christophe

> --
> Paulo Matos
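[Editorial note: the points raised in the thread, keying results on a (sum file, test name) tuple and not silently merging duplicate lines such as the repeated "PASS: gcc.dg/Werror-13.c  (test for errors, line )" entries, or a test that reports both PASS and FAIL, can be sketched as below. This is a hypothetical parser, not the gcc-compare-results code; `parse_sum` and the exact line handling are assumptions.]

```python
# Sketch: key DejaGnu results on (sum file, test name) and keep every
# occurrence, so duplicate test names and PASS+FAIL pairs stay visible.
from collections import defaultdict

RESULT_PREFIXES = ("PASS", "FAIL", "XPASS", "XFAIL",
                   "UNRESOLVED", "UNSUPPORTED", "UNTESTED")

def parse_sum(sum_name, lines):
    """Map (sum file, test name) -> list of statuses, keeping duplicates."""
    results = defaultdict(list)
    for line in lines:
        status, sep, test = line.partition(": ")
        if sep and status in RESULT_PREFIXES:
            results[(sum_name, test.strip())].append(status)
    return results
```

Per the thread, a multilib identifier would be added to the key when --target_board runs several multilibs in one testsuite run.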