Hey all,

As you might know, we've set up an internal CI system that runs `make check`
on a variety of different platforms and configurations, 16 in total.

As we've experienced more and more pain maintaining a green master, I've
compiled some statistics about which tests are most flaky. I thought other
people might also be interested to have a look at that data:

Last Week:

    # CI Statistics since 2018-10-05 14:22:35.422882 for branches containing 'asf/master'
    Total: 41 failing tests, 28 unique. (avg 0.142361111111 failing tests per build)

    Top 5 failing tests:
    6x: [empty]
    4x: ResourceStatistics
    2x: CreateDestroyDiskRecovery
    2x: INTERNET_CURL_InvokeFetchByName
    2x: RecoverNestedContainer

Last Month:

    # CI Statistics since 2018-09-12 14:23:36.272031 for branches containing 'asf/master'
    Total: 320 failing tests, 75 unique. (avg 0.285714285714 failing tests per build)

    Top 5 failing tests:
    57x: Used
    32x: LongLivedDefaultExecutorRestart
    27x: PythonFramework
    23x: ROOT_CGROUPS_LaunchNestedContainerSessionsInParallel
    22x: ResourceStatistics

Last Year:

    # CI Statistics since 2017-10-12 14:24:31.639792 for branches containing 'asf/master'
    Total: 3045 failing tests, 225 unique. (avg 0.184054642166 failing tests per build)

    Top 5 failing tests:
    292x: [empty]
    272x: ROOT_LOGROTATE_UNPRIVILEGED_USER_RotateWithSwitchUserTrueOrFalse
    136x: LOGROTATE_RotateInSandbox
    136x: LOGROTATE_CustomRotateOptions
    131x: ResourceStatistics


I don't really have a point with all of this, but some observations:
 - [empty] means that the `mesos-tests` binary crashed
 - The data also includes "real" (i.e. non-flaky) test failures, but they
should not appear in the top 5 lists because we would hopefully either
revert or fix them before they can accumulate dozens of failures
 - Over the whole year, we seem to be pretty good at fixing the nastiest
flakes, with only one of the top 5 still appearing in this week's test
results
 - Sadly, the failure rate isn't as different between now and then as we
might have hoped: about 0.14 failing tests per build last week versus 0.18
over the whole year.
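
For anyone who wants to produce summaries of this shape from their own CI
data, here is a rough sketch. It is not the script that generated the
numbers above. The input format (one list of failing test names per build,
with an empty string standing in for a crashed test binary) and the names
`summarize` and `format_summary` are assumptions made for illustration.

```python
from collections import Counter


def summarize(builds, top_n=5):
    """Summarize failures across builds.

    `builds` is a list with one entry per build; each entry is a list of
    the test names that failed in that build. An empty string is assumed
    to mean that the test binary crashed (reported as [empty]).
    """
    failures = [name for build in builds for name in build]
    counts = Counter(failures)
    # Average number of failing tests per build, 0.0 if there are no builds.
    avg = len(failures) / len(builds) if builds else 0.0
    return {
        "total": len(failures),
        "unique": len(counts),
        "avg_per_build": avg,
        "top": counts.most_common(top_n),
    }


def format_summary(summary):
    """Render a summary in a layout similar to the statistics above."""
    lines = [
        "Total: {total} failing tests, {unique} unique. "
        "(avg {avg_per_build} failing tests per build)".format(**summary),
        "",
        "Top {} failing tests:".format(len(summary["top"])),
    ]
    for name, count in summary["top"]:
        lines.append("{}x: {}".format(count, name or "[empty]"))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up input: four builds, two of them green.
    example = [["ResourceStatistics"], [], ["ResourceStatistics", ""], []]
    print(format_summary(summarize(example)))
```

Green builds are included in the input as empty lists, so they count
towards the per-build average even though they contribute no failures.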

Hope this was interesting, and best regards,
-- 
Benno Evers
Software Engineer, Mesosphere
