GCC Buildbot Update - Definition of regression

Paulo Matos Tue, 10 Oct 2017 12:45:31 -0700

Hi all,

It has been almost 3 weeks since my last post on the GCC Buildbot.
Here's an update:


* 3 x86_64 workers from CF are now installed;
* There's one scheduler for trunk doing fresh builds for every daily bump;
* One scheduler doing incremental builds for each active branch;
* An IRC bot which is currently silent.

The next steps are:
* Enable LNT (I have installed this but have yet to connect it to
buildbot) for tracking performance benchmarks over time -- it should
come up as http://gcc-lnt.linki.tools in the near future.
* Enable regression analysis -- This is fundamental. I understand that
without this the buildbot is pretty useless, so it has the highest
priority. However, I would like some agreement as to what in GCC should
be considered a regression. Each test in DejaGnu can have one of several
statuses:
FAIL, PASS, UNSUPPORTED, UNTESTED, XPASS, KPASS, XFAIL, KFAIL, UNRESOLVED

Since GCC doesn't have a 'clean bill' of test results, we need to analyse
the sum files for the current run and compare them with those from the
last run of the same branch. My proposal is that if any test shows one
of the following transitions, then a regression exists and the test run
should be marked as a failure.

    ANY -> no test          ; Test disappears
    ANY / XPASS -> XPASS    ; Test goes from any status other than XPASS to XPASS
    ANY / KPASS -> KPASS    ; Test goes from any status other than KPASS to KPASS
    new test -> FAIL        ; New test starts as fail
    PASS -> ANY             ; Test moves away from PASS
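
To make the proposal concrete, here is a rough Python sketch of how the
transitions above could be checked between two sum files. It is only an
illustration, not the code deployed on the buildbot. The function names
(parse_sum, is_regression, find_regressions) are made up for this
example. It assumes result lines have the form "STATUS: test name", and
it assumes a new test is a regression only when it starts as FAIL, which
is one reading of the rules (a new test that starts as XPASS or KPASS is
not flagged here).

```python
import re

# Statuses listed in the post.
STATUSES = ("FAIL", "PASS", "UNSUPPORTED", "UNTESTED", "XPASS",
            "KPASS", "XFAIL", "KFAIL", "UNRESOLVED")

# Assumption: each result line in a .sum file looks like "STATUS: test name".
RESULT_LINE = re.compile(r"^(%s): (.+)$" % "|".join(STATUSES))


def parse_sum(text):
    """Return a dict mapping test name -> status for one .sum file."""
    results = {}
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            status, name = match.groups()
            # Simplification: if a name repeats, the last occurrence wins.
            results[name] = status
    return results


def is_regression(old, new):
    """Apply the proposed rules; None means the test is absent from that run."""
    if old is None:
        # new test -> FAIL (assumption: other initial statuses are accepted)
        return new == "FAIL"
    if new is None:
        return True                      # ANY -> no test
    if new in ("XPASS", "KPASS") and old != new:
        return True                      # ANY / XPASS -> XPASS, ANY / KPASS -> KPASS
    if old == "PASS" and new != "PASS":
        return True                      # PASS -> ANY
    return False


def find_regressions(previous, current):
    """Return sorted (test name, old status, new status) for every regression."""
    names = sorted(set(previous) | set(current))
    return [(name, previous.get(name), current.get(name))
            for name in names
            if is_regression(previous.get(name), current.get(name))]
```

A run would then be marked as a failure whenever
find_regressions(parse_sum(previous_text), parse_sum(current_text))
returns a non-empty list. Note that under these rules FAIL -> FAIL is
not flagged, which is what allows comparing against a baseline that
already contains failures.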

This is a suggestion. I am keen to have corrections from people who use
this on a daily basis and/or have a better understanding of each status.

As soon as we reach a consensus, I will deploy this analysis and enable
the IRC bot to report the test results on the #gcc channel.

-- 
Paulo Matos
