How easy is it to identify CPU-sensitive tests?

I think the most practical solution (at least in the near term) is to find that set of tests, and run only that set on a faster VM, or on real hardware (like our ix slaves).
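One rough way to find that set automatically (a sketch only; the threshold and the runtime data sources are my assumptions, not anything we collect today) would be to compare per-test runtimes between a fast reference machine and a slow tbpl-class VM, and flag tests whose runtime blows up disproportionately:

```python
def find_cpu_sensitive(fast_runtimes, slow_runtimes, ratio_threshold=3.0):
    """Flag tests whose slow-host runtime grows disproportionately.

    fast_runtimes / slow_runtimes map test name -> seconds, as measured
    on a fast reference machine and a slow tbpl-class VM respectively.
    The 3.0 threshold is an invented starting point, not a measurement.
    """
    flagged = []
    for name, fast_s in fast_runtimes.items():
        slow_s = slow_runtimes.get(name)
        if slow_s is None or fast_s <= 0:
            continue  # no slow-host data, or degenerate timing
        if slow_s / fast_s >= ratio_threshold:
            flagged.append(name)
    return sorted(flagged)
```

A test like the M10 one below (10-12 s locally, 350-450 s on tbpl) would be flagged immediately, while ordinary tests that merely scale with CPU speed would not.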

Jonathan


On 4/7/2014 3:16 PM, Randell Jesup wrote:
The B2G emulator design is causing all sorts of problems.  We just fixed
the #2 orange which was caused by the Audio channel StartPlaying()
taking up to 20 seconds to run (and we "fixed" it by effectively
removing some timeouts).  However, we just wasted half a week trying to
land AEC & MediaStreamGraph improvements.  We still haven't landed due
to yet another B2G emulator orange, but the solution we used for the M10
problem doesn't fix the fundamental problems with B2G emulator.

Details:

We ran into huge problems getting AEC/MediaStreamGraph changes (bug
818822 and things dependent on it) into the tree due to problems with
B2G-emulator debug M10 (permaorange timeouts).  This change adds a fairly
small amount of processing to input audio data (resampling to 44100Hz).

A test that runs perfectly in emulator opt builds and runs fine locally
in M10 debug (10-12 seconds reported for the test in the logs, with or
without the change), goes from taking 30-40 seconds on tbpl to
350-450(!) seconds (and then times out).  Fix that one, and others fail
even worse.

I contacted Gregor Wagner asking for help and also jgriffin in #b2g.  We
found one problem (emulator going to 'sleep' during mochitests, bug
992436); I have a patch up to enable wakelock globally for mochitests.
However, that just pushed the error a little deeper.

The fundamental problem is that b2g-emulator can't deal safely with any
sort of realtime or semi-realtime data unless run on a fast machine.
The architecture for the emulator setup means the effective CPU power is
dependent on the machine running the test, and that varies a lot (and
tbpl machines are WAY slower than my 2.5 year old desktop).  Combine
that with Debug being much slower, and it's a recipe for disaster for any
sort of time-dependent tests.

I worked around it for now by turning down the timers that push fake
realtime data into the system - this will cause audio underruns in
MediaStreamGraph, and doesn't solve the problem of MediaStreamGraph
potentially overloading itself for other reasons, or breaking
assumptions about being able to keep up with data streams.  (MSG wants
to run every 10ms or so.)
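The arithmetic of that workaround looks roughly like this (a sketch with made-up names; the real patch lives in Gecko C++). Stretching the fake-data push interval by a host-slowdown factor trades underruns for timeouts:

```python
NOMINAL_PUSH_INTERVAL_MS = 10  # MSG wants to iterate every ~10 ms

def scaled_push_interval_ms(slowdown_factor):
    """Stretch the fake-audio push timer when the host can't keep real time.

    slowdown_factor is how many times slower the emulator host is than a
    machine that can service the 10 ms cadence.  Never speed the timer up
    below nominal, only slow it down.
    """
    return NOMINAL_PUSH_INTERVAL_MS * max(1.0, slowdown_factor)
```

On a host running 4x too slow this pushes data every 40 ms, which stops the timeouts but guarantees the underruns described above.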

This problem also likely plays hell with the Web Audio tests, and will
play hell with WebRTC echo cancellation and the media reception code,
which will start trying to insert loss-concealment data and break
timer-based packet loss recovery, bandwidth estimators, etc.


As to what to do?  That's a good question, as turning off the emulator
tests isn't a realistic option.

One option (very, very painful, and even slower) would be a proper
device simulator which simulates both the CPU and the system hardware
(of *some* B2G phone).  This would produce the most realistic result
with an emulator.

Another option (likely not simple) would be to find a way to "slow down
time" for the emulator, such as intercepting system calls and scaling up
any time constants (multiplying timer values, timeout values to socket
calls, etc).  For devices (audio, etc), sample rates or other
frequencies may need adjusting as well.
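To make the timeout-multiplying half of that concrete, here's a minimal sketch (the dilation factor and wrapper name are invented): any finite timeout handed to a socket/poll-style call gets stretched, while the conventional "block forever" sentinels pass through untouched:

```python
TIME_DILATION = 8  # assumed factor: emulator host ~8x slower than a phone

def dilate_timeout(timeout_s):
    """Stretch a finite timeout before passing it to a socket/poll call.

    None and negative values conventionally mean "block forever" and must
    pass through unchanged, or we'd turn blocking calls into huge waits.
    """
    if timeout_s is None or timeout_s < 0:
        return timeout_s
    return timeout_s * TIME_DILATION
```

The hard part, of course, is interposing on every place a time constant leaks in (timers, poll/select, monotonic clock reads), not the multiplication itself.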

We could require a minimum of X BogoMIPS to run the emulator at all, or
to run a specific test suite.
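A BogoMIPS gate could be as simple as parsing /proc/cpuinfo before starting the suite (a sketch; the 4000 threshold is a placeholder, not a measured requirement):

```python
def host_bogomips(cpuinfo_text):
    """Return the first 'bogomips' value from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "bogomips":
            return float(value)
    return None

def fast_enough(cpuinfo_text, required_bogomips=4000.0):
    """True if the host reports at least the required BogoMIPS."""
    mips = host_bogomips(cpuinfo_text)
    return mips is not None and mips >= required_bogomips
```

BogoMIPS is a crude proxy (it ignores memory bandwidth, VM contention, etc.), so this would gate out obviously underpowered hosts rather than guarantee the tests keep real time.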

We could segment out tests that require higher performance and run them
on faster VMs/etc.

We could turn off certain tests on tbpl and run them on separate
dedicated test machines (a bit similar to PGO).  There are downsides to
this of course.

Lastly, we could put in a bank of HW running B2G to run the tests like
the Android test boards/phones.


So, what do we do?  Because if we do nothing, it will only get worse.


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
