On 11 October 2017 at 11:03, Paulo Matos <pmatos@linki.tools> wrote:
>
>
> On 11/10/17 10:35, Christophe Lyon wrote:
>>
>> FWIW, we consider regressions:
>> * any->FAIL because we don't want such a regression at the whole
>>   testsuite level
>> * any->UNRESOLVED for the same reason
>> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}->XPASS
>> * new XPASS
>> * XFAIL disappears (may mean that a testcase was removed, worth a
>>   manual check)
>> * ERRORS
>>
>
> That's certainly stricter than what was proposed by Joseph. I will
> run a few tests on historical data to see what I get using both
> approaches.
>
>>
>>>> ERRORs in the .sum or .log files should be watched out for as well,
>>>> however, as sometimes they may indicate broken Tcl syntax in the
>>>> testsuite, which may cause many tests not to be run.
>>>>
>>>> Note that the test names that come after PASS:, FAIL: etc. aren't
>>>> unique between different .sum files, so you need to associate tests
>>>> with a tuple (.sum file, test name) (and even then, sometimes
>>>> multiple tests in a .sum file have the same name, but that's a
>>>> testsuite bug). If you're using --target_board options that run
>>>> tests for more than one multilib in the same testsuite run, add the
>>>> multilib to that tuple as well.
>>>>
>>>
>>> Thanks for all the comments. Sounds sensible.
>>> By not being unique, you mean between languages?
>>
>> Yes, but not only, as Joseph mentioned above.
>>
>> You have the obvious example of c-c++-common/*san tests, which are
>> common to gcc and g++.
>>
>>> I assume that two gcc.sum files from different builds will always
>>> refer to the same test/configuration when referring to (for example):
>>> PASS: gcc.c-torture/compile/20000105-1.c -O1 (test for excess errors)
>>>
>>> In this case, I assume that "gcc.c-torture/compile/20000105-1.c -O1
>>> (test for excess errors)" will always be referring to the same thing.
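[Editorial note: the regression rules Christophe lists above can be sketched as a small classifier. This is a hypothetical illustration, not code from the Linaro scripts; the function name `is_regression` and the use of `None` for a missing test are assumptions, and treating an unchanged FAIL/UNRESOLVED as already-known (rather than a new regression) is one possible reading of "any->FAIL".]

```python
# Sketch of the regression rules above; names are illustrative,
# not taken from the Linaro comparison scripts.

REGRESSION_STATES = {"FAIL", "UNRESOLVED"}      # any -> FAIL / UNRESOLVED
XPASS_FROM = {None, "PASS", "UNSUPPORTED",      # None models a new test,
              "UNTESTED", "UNRESOLVED"}         # covering "new XPASS"

def is_regression(old, new):
    """Classify one (old status, new status) transition.

    old is None for a test absent from the old run; new is None for a
    test that disappeared.  An unchanged FAIL/UNRESOLVED is treated as
    already known, not a new regression (one reading of "any->FAIL").
    """
    if new in REGRESSION_STATES and old != new:
        return True
    if new == "XPASS" and old in XPASS_FROM:
        return True
    if old == "XFAIL" and new is None:
        # XFAIL disappeared: may mean the testcase was removed;
        # worth a manual check.
        return True
    return False
```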
>>>
>>
>> In gcc.sum, I can see 4 occurrences of
>> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>>
>> Actually, there are quite a few others like that....
>>
>
> That actually surprised me.
>
> I also see:
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
>
> among others like it. Looks like a line number is missing?
>
> In any case, it feels like the code I have for tracking this down
> needs to be improved.
>

We had to derive our scripts from the ones in contrib/ because those
failed to handle some cases (e.g. when the same test reports both PASS
and FAIL; yes, it does happen).
You can have a look at
https://git.linaro.org/toolchain/gcc-compare-results.git/
where compare_tests is a patched version of the contrib/ script; it
calls the main perl script (which is not the prettiest thing :-)

Christophe

> --
> Paulo Matos
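[Editorial note: the points raised in the thread, keying results on a (sum file, test name) tuple and not silently merging duplicate lines such as the repeated "PASS: gcc.dg/Werror-13.c  (test for errors, line )" entries, or a test that reports both PASS and FAIL, can be sketched as below. This is a hypothetical parser, not the gcc-compare-results code; `parse_sum` and the exact line handling are assumptions.]

```python
# Sketch: key DejaGnu results on (sum file, test name) and keep every
# occurrence, so duplicate test names and PASS+FAIL pairs stay visible.
from collections import defaultdict

RESULT_PREFIXES = ("PASS", "FAIL", "XPASS", "XFAIL",
                   "UNRESOLVED", "UNSUPPORTED", "UNTESTED")

def parse_sum(sum_name, lines):
    """Map (sum file, test name) -> list of statuses, keeping duplicates."""
    results = defaultdict(list)
    for line in lines:
        status, sep, test = line.partition(": ")
        if sep and status in RESULT_PREFIXES:
            results[(sum_name, test.strip())].append(status)
    return results
```

Per the thread, a multilib identifier would be added to the key when --target_board runs several multilibs in one testsuite run.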