On 01/15/2014 12:59 PM, Fabrice Desré wrote:
> That looks very dependent on the performance of the test runner. How do
> we take that into account?

We could use operating system facilities to detect and mark (or even abort) test runs where the test runner is not meeting certain performance guarantees and treat them as a build/infra failure rather than a test failure. Specifically, if we find that our processes are not getting sufficient CPU resources or the I/O subsystems are taking unacceptably long, we discard the test results entirely.
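As a rough sketch of the CPU half of that check: on Linux, /proc/<pid>/schedstat reports three numbers for a task (nanoseconds spent on the CPU, nanoseconds spent waiting on a run queue, and the number of timeslices run). Sampling it before and after a test run gives the fraction of time the process wanted the CPU but did not get it. The function names and the 0.5 threshold below are made up for illustration, the file is only present on kernels built with scheduler statistics, and it covers only the main thread of the process.

```python
from typing import NamedTuple


class SchedStat(NamedTuple):
    run_ns: int      # time spent running on a CPU
    wait_ns: int     # time spent runnable but waiting on a run queue
    timeslices: int  # number of timeslices run


def parse_schedstat(text: str) -> SchedStat:
    """Parse the three whitespace-separated fields of a schedstat file."""
    run_ns, wait_ns, timeslices = (int(field) for field in text.split()[:3])
    return SchedStat(run_ns, wait_ns, timeslices)


def read_schedstat(pid: int) -> SchedStat:
    """Read /proc/<pid>/schedstat (Linux only)."""
    with open(f"/proc/{pid}/schedstat") as f:
        return parse_schedstat(f.read())


def wait_fraction(before: SchedStat, after: SchedStat) -> float:
    """Share of runnable time between two samples that was spent waiting."""
    run = after.run_ns - before.run_ns
    wait = after.wait_ns - before.wait_ns
    total = run + wait
    return wait / total if total > 0 else 0.0


def runner_was_starved(before: SchedStat, after: SchedStat,
                       max_wait_fraction: float = 0.5) -> bool:
    """Hypothetical policy: treat the run as an infra failure above the threshold."""
    return wait_fraction(before, after) > max_wait_fraction
```

A harness would call read_schedstat() on the test process at the start and end of the run and discard the results when runner_was_starved() returns True. The right threshold is something we would have to find empirically.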

For example, the Linux kernel tracing mechanism provides sched_switch and sched_wakeup trace-points. These can be used to tell when a process/thread is actively running on the CPU, when it wishes it was running ('R' state), when it starts blocking on something ('D'/'S'), and when it wakes up because what it was blocking on became available. Inference of what it was blocking on is also possible.
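To make that concrete, here is a minimal sketch that adds up on-CPU time per pid from sched_switch lines in the text format ftrace writes to its trace file. The regular expression assumes the usual "prev_pid=... prev_state=... ==> ... next_pid=..." layout, which can differ between kernel versions, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the timestamp and the pid fields of a sched_switch trace line.
SWITCH_RE = re.compile(
    r"(?P<ts>\d+\.\d+): sched_switch: .*?prev_pid=(?P<prev_pid>\d+) .*?"
    r"prev_state=(?P<prev_state>\S+) ==> .*?next_pid=(?P<next_pid>\d+)"
)


def on_cpu_seconds(lines):
    """Return {pid: seconds on CPU} from an iterable of trace lines.

    Time is counted from the switch that schedules a pid in until the
    switch that schedules it out. Pid 0 (the per-CPU idle task) is not
    meaningful on multi-CPU traces and should be ignored by callers.
    """
    switched_in_at = {}
    totals = defaultdict(float)
    for line in lines:
        match = SWITCH_RE.search(line)
        if match is None:
            continue  # not a sched_switch event
        ts = float(match["ts"])
        prev_pid = int(match["prev_pid"])
        next_pid = int(match["next_pid"])
        if prev_pid in switched_in_at:
            totals[prev_pid] += ts - switched_in_at.pop(prev_pid)
        switched_in_at[next_pid] = ts
    return dict(totals)
```

The captured prev_state is what distinguishes a task that was preempted while still runnable ('R') from one that went to sleep ('S'/'D'), so the same loop could be extended to separate "wanted to run" time from "blocked" time.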

In general, there's lots of fanciness possible using systemtap (ex: https://sourceware.org/systemtap/examples/#profiling/latencytap.stp) and things like chromium's trace-viewer which has JS code to parse the linux "perf" tool's logs (https://chromium.googlesource.com/external/trace-viewer/+/master/src/tracing/importer/ look for the "linux_perf" bits).

However, the simplest/most pragmatic thing to do initially is probably just to use the existing Linux perf tool to generate logs of what's happening during the test runs using "perf sched" and, when we see a failure due to event loop sadness, look at that log and decide if it was a real failure and what we can do about it. There is a great set of examples at http://lwn.net/Articles/353295/
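A minimal sketch of what that wrapping could look like, assuming perf is installed on the test runner and we are allowed to record scheduler events (which usually needs root). The function names and the per-run log directory are made up for illustration. The recording is left in perf's default perf.data file inside that directory.

```python
import subprocess


def perf_record_command(test_command):
    """Build the argv that runs test_command under 'perf sched record'."""
    return ["perf", "sched", "record", "--"] + list(test_command)


def run_with_sched_log(test_command, log_dir):
    """Run the tests under perf, writing perf.data into log_dir.

    Returns perf's exit status, which may not reflect the test outcome,
    so the harness should still judge pass/fail from its own results.
    """
    result = subprocess.run(perf_record_command(test_command), cwd=log_dir)
    return result.returncode


def latency_report(log_dir):
    """Return the text of 'perf sched latency' for the recording in log_dir."""
    result = subprocess.run(
        ["perf", "sched", "latency"],
        cwd=log_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The idea would be to keep perf.data only for runs that hit an event loop failure, and call latency_report() on those when triaging.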

The obvious complicating factor is that virtualization can potentially rob the virtualized OS of CPU time without it knowing about it (in a useful way). This can be dealt with, but is potentially a hassle. This isn't an issue for Linux containers, and there are trace-points for kvm_entry/kvm_exit. Xen appears to have some capabilities like this too: http://blog.xen.org/index.php/2012/09/27/tracing-with-xentrace-and-xenalyze/
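One cheap signal that is visible from inside a guest, where the hypervisor reports it, is the "steal" counter on the aggregate cpu line of /proc/stat (the eighth value after the label). The sketch below compares two samples of that line; whether steal is populated at all depends on the virtualization setup, so a zero here does not prove nothing was taken. The function names are made up for illustration.

```python
def parse_cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            if len(values) < 8:
                raise ValueError("kernel does not report steal time")
            return values
    raise ValueError("no aggregate cpu line found")


def steal_fraction(before, after):
    """Share of CPU time between two samples that was reported as stolen.

    Only the first eight counters (user .. steal) are summed, because
    guest time is already included in user time.
    """
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    return deltas[7] / total if total > 0 else 0.0
```

Sampling /proc/stat at the start and end of a test run and feeding both through parse_cpu_times() would give a number we could log next to the test results.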

Of course, in all cases this requires us to have sufficient privileges and/or control over the test-runners to be capable of using these tools. I'm not sure if that's the case for travis, but my understanding is that the continuous integration server project :lightsofapollo and friends are working on may change that situation.

Andrew
_______________________________________________
dev-b2g mailing list
dev-b2g@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-b2g
