Re: B2G emulator issues
I wrote in April:

The B2G emulator design is causing all sorts of problems. We just fixed the #2 orange which was caused by the Audio channel StartPlaying() taking up to 20 seconds to run (and we fixed it by effectively removing some timeouts). However, we just wasted half a week trying to land AEC MediaStreamGraph improvements. We still haven't landed due to yet another B2G emulator orange, but the solution we used for the M10 problem doesn't fix the fundamental problems with B2G emulator.

You can read the earlier thread (starting 7-apr) about this issue. We wallpapered over the issues (including turning down 'fake' audio generation to 1/10th realtime and letting it underflow).

The problems with the b2g emulator have just gotten worse as we add more tests and make changes to improve the system that give the emulators fits.

Right now, we're looking at being blocked from landing important improvements (that make things *not* fail due to perf timeouts in real-user-scenarios) because b2g-emulator chokes on anything even smelling of realtime data. It can stall for tens of seconds (see above), or even minutes. Even running a single test can cause other, unrelated tests to perma-orange.

The stuff we've had to do (like turning down audio generation) to block oranges in the current setup makes the tests very non-real-world, and so greatly diminishes their utility anyways.

There was work being done to move media and other semi-realtime tests to faster hardware; that is happening but it's not ready yet. (For reference, in April tests showed that a b2g emulator mochitest that took 10 seconds on my Xeon took 350-450 seconds on tbpl.)

The fundamental problem is that b2g-emulator can't deal safely with any sort of realtime or semi-realtime data unless run on a fast machine. The architecture for the emulator setup means the effective CPU power is dependent on the machine running the test, and that varies a lot (and tbpl machines are WAY slower than my 2.5 year old desktop). Combine that with Debug being much slower, and it's a recipe for disaster for any sort of time-dependent tests.

...

So, what do we do? Because if we do nothing, it will only get worse.

So we've done nothing (that's landed at least), and it has gotten worse, and we're at the breaking point where b2g emulator (especially debug) for media tests (especially webrtc) is providing negative value, and blocking critically important improvements.

We've just landed bug 1059867 to disable most webrtc tests on the emulator until we can get them running on hardware that has the power to run them (or other fixes make them viable again (bug 1059878)). We may need to consider similar measures for other media tests (webaudio, etc).

In the meantime, we're going to try to run local emulator pull/build/mochitest cronjobs on faster desktop machines (perhaps mine) on a daily or perhaps continuous basis. (Poor man's tbpl - maybe I'll un-mothball tinderbox for some nostalgic flames...) Also note that webrtc tests do run on the b2g desktop tbpl runs, so we have some coverage.

I hope we can find a better solution than running it on my dev machine sometime soon (very soon!), but right now that's better than playing whack-a-random-timeout or just increasing run times to infinity.

P.S. there are some interesting threads of stuff that could help a lot, like the comment Jay Wang made in April about SpecialPowers.exactGC taking 3-10s per instance on b2g debug, and tons of them being run (one test took 102s to finish, and had 90 gc's which mostly took ~10s each). Bug 1012516

-- Randell Jesup, Mozilla Corp remove news for personal email
___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: B2G emulator issues
Some more details on how we're approaching this problem from the infrastructure side: Releng recently gave us the ability to run select jobs on faster VM's than the default, see https://bugzilla.mozilla.org/show_bug.cgi?id=1031083. We have B2G emulator media mochitests scheduled on cedar using these faster VM's. After fixing a minor problem with these, we'll be able to see if these faster VM's solve the problem. Local experiments suggest they do, but it will take a number of runs in buildbot to be sure. If that doesn't fix the problem, we have the option of trying still faster VM's (at greater cost), or trying to run the tests on real hardware. The disadvantage of running the tests on real hardware is that such hardware doesn't scale very readily and is already stretched pretty thin, and the emulator doesn't currently run on our linux hardware slaves, and will require some amount of work to fix. This work is being tracked in https://bugzilla.mozilla.org/show_bug.cgi?id=994920. Jonathan On 8/28/2014 3:06 PM, Randell Jesup wrote: I wrote in April: The B2G emulator design is causing all sorts of problems. We just fixed the #2 orange which was caused by the Audio channel StartPlaying() taking up to 20 seconds to run (and we fixed it by effectively removing some timeouts). However, we just wasted half a week trying to land AEC MediaStreamGraph improvements. We still haven't landed due to yet another B2G emulator orange, but the solution we used for the M10 problem doesn't fix the fundamental problems with B2G emulator. You can read the earlier thread (starting 7-apr) about this issue. We wallpapered over the issues (including turning down 'fake' audio generation to 1/10th realtime and letting it underflow). The problems with the b2g emulator have just gotten worse as we add more tests and make changes to improve the system that give the emulators fits. 
Right now, we're looking at being blocked from landing important improvements (that make things *not* fail due to perf timeouts in real-user-scenarios) because b2g-emulator chokes on anything even smelling of realtime data. It can stall for 10's of seconds (see above), or even minutes. Even running a single test can cause other, unrelated tests to perma-orange. The stuff we've had to do (like turning down audio generation) to block oranges in the current setup makes the tests very non-real-world, and so greatly diminishes their utility anyways. There was work being done to move media and other semi-realtime tests to faster hardware; that is happening but it's not ready yet. (For reference, in April tests showed that a b2g emulator mochitest that took 10 seconds on my Xeon took 350-450 seconds on tbpl.) The fundamental problem is that b2g-emulator can't deal safely with any sort of realtime or semi-realtime data unless run on a fast machine. The architecture for the emulator setup means the effective CPU power is dependent on the machine running the test, and that varies a lot (and tbpl machines are WAY slower than my 2.5 year old desktop). Combine that with Debug being much slower, and it's recipe for disaster for any sort of time-dependent tests. ... So, what do we do? Because if we do nothing, it will only get worse. So we've done nothing (that's landed at least), and it has gotten worse, and we're at the breaking point where b2g emulator (especially debug) for media tests (especially webrtc) is providing negative value, and blocking critically important improvements. We've just landed bug 1059867 to disable most webrtc tests on the emulator until we can get them running on hardware that has the power to run them (or other fixes make them viable again (bug 1059878)). We may need to consider similar measures for other media tests (webaudio, etc). 
In the meantime, we're going to try to run local emulator pull/build/mochitest cronjobs on faster desktop machines (perhaps mine) on a daily or perhaps continuous basis. (Poor man's tbpl - maybe I'll un-mothball tinderbox for some nostalgic flames...) Also note that webrtc tests do run on the b2g desktop tbpl runs, so we have some coverage. I hope we can find a better solution than run it on my dev machine sometime soon (very soon!), but right now that's better than playing whack-a-random-timeout or just increasing run times to infinity. P.S. there are some interesting threads of stuff that could help a lot, like the comment Jay Wang made in April about SpecialPowers.exactGC taking 3-10s per instance on b2g debug, and tons of them being run (one test took 102s to finish, and had 90 gc's which mostly took ~10s each). Bug 1012516 ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: B2G emulator issues
Thanks Jonathan for the update.

I would like to point out that, at least for the WebRTC tests which test the connection between two WebRTC clients, we theoretically also have the option to split the tests into two halves (which we already do for steeplechase tests anyhow), and then start two emulators on the same host machine. This could potentially speed up some of the test execution, as the emulator seems to utilize only one host CPU. This approach would have the advantage of scaling well, as we could keep using EC2.

But this does not help with the media issues Randell was describing below. If we get to this point, it might make sense to evaluate which test platform works best for which tests.

Best
Nils

On 8/28/14 5:28 PM, Jonathan Griffin wrote:

Some more details on how we're approaching this problem from the infrastructure side:

Releng recently gave us the ability to run select jobs on faster VM's than the default, see https://bugzilla.mozilla.org/show_bug.cgi?id=1031083. We have B2G emulator media mochitests scheduled on cedar using these faster VM's. After fixing a minor problem with these, we'll be able to see if these faster VM's solve the problem. Local experiments suggest they do, but it will take a number of runs in buildbot to be sure.

If that doesn't fix the problem, we have the option of trying still faster VM's (at greater cost), or trying to run the tests on real hardware. The disadvantage of running the tests on real hardware is that such hardware doesn't scale very readily and is already stretched pretty thin, and the emulator doesn't currently run on our linux hardware slaves, and will require some amount of work to fix. This work is being tracked in https://bugzilla.mozilla.org/show_bug.cgi?id=994920.

Jonathan

On 8/28/2014 3:06 PM, Randell Jesup wrote:

I wrote in April: The B2G emulator design is causing all sorts of problems.
We just fixed the #2 orange which was caused by the Audio channel StartPlaying() taking up to 20 seconds to run (and we fixed it by effectively removing some timeouts). However, we just wasted half a week trying to land AEC MediaStreamGraph improvements. We still haven't landed due to yet another B2G emulator orange, but the solution we used for the M10 problem doesn't fix the fundamental problems with B2G emulator. You can read the earlier thread (starting 7-apr) about this issue. We wallpapered over the issues (including turning down 'fake' audio generation to 1/10th realtime and letting it underflow). The problems with the b2g emulator have just gotten worse as we add more tests and make changes to improve the system that give the emulators fits. Right now, we're looking at being blocked from landing important improvements (that make things *not* fail due to perf timeouts in real-user-scenarios) because b2g-emulator chokes on anything even smelling of realtime data. It can stall for 10's of seconds (see above), or even minutes. Even running a single test can cause other, unrelated tests to perma-orange. The stuff we've had to do (like turning down audio generation) to block oranges in the current setup makes the tests very non-real-world, and so greatly diminishes their utility anyways. There was work being done to move media and other semi-realtime tests to faster hardware; that is happening but it's not ready yet. (For reference, in April tests showed that a b2g emulator mochitest that took 10 seconds on my Xeon took 350-450 seconds on tbpl.) The fundamental problem is that b2g-emulator can't deal safely with any sort of realtime or semi-realtime data unless run on a fast machine. The architecture for the emulator setup means the effective CPU power is dependent on the machine running the test, and that varies a lot (and tbpl machines are WAY slower than my 2.5 year old desktop). 
Combine that with Debug being much slower, and it's recipe for disaster for any sort of time-dependent tests. ... So, what do we do? Because if we do nothing, it will only get worse. So we've done nothing (that's landed at least), and it has gotten worse, and we're at the breaking point where b2g emulator (especially debug) for media tests (especially webrtc) is providing negative value, and blocking critically important improvements. We've just landed bug 1059867 to disable most webrtc tests on the emulator until we can get them running on hardware that has the power to run them (or other fixes make them viable again (bug 1059878)). We may need to consider similar measures for other media tests (webaudio, etc). In the meantime, we're going to try to run local emulator pull/build/mochitest cronjobs on faster desktop machines (perhaps mine) on a daily or perhaps continuous basis. (Poor man's tbpl - maybe I'll un-mothball tinderbox for some nostalgic flames...) Also note that webrtc tests do run on the b2g desktop tbpl runs, so we have some coverage. I hope we can find a better solution than run it on my dev
Re: B2G emulator issues
On Tuesday, April 8, 2014 11:45:15 PM UTC+8, Mike Habicher wrote:

In my experience running tests locally, a single mochitest run on the ARM emulator (hardware: Thinkpad X220, 16GB RAM, SSD) where everything was built with 'B2G_DEBUG=0 B2G_NOOPT=0' will run in 2 to 3 minutes. The same test, run with 'B2G_DEBUG=1 B2G_NOOPT=0' will take 7 to 10 minutes.

--m.

It could be the same problem as Bug 1012516. test_media_selection.html can take up to 1025454ms on B2G ICS Emulator Debug. MediaManager runs a GC after finishing each token, and this test has 90 tokens, so roughly 10s * 90 = 900s of the run is spent in GC.
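As a rough check on the arithmetic above, here is a minimal Python sketch that computes how much of the reported worst-case run would be GC time. The three constants are the figures quoted in this message; the 10 s per GC is the approximate value reported, not something measured here, and the function name is only illustrative.

```python
# Figures quoted in the message above. The 10 s per GC is the approximate
# cost reported for B2G debug, not a measurement made by this script.
TOTAL_RUNTIME_MS = 1_025_454   # worst-case runtime of test_media_selection.html
GC_COUNT = 90                  # one GC per token
GC_COST_S = 10.0               # approximate cost of a single GC


def gc_share(total_ms, gc_count, gc_cost_s):
    """Return (seconds spent in GC, fraction of the total run spent in GC)."""
    gc_seconds = gc_count * gc_cost_s
    return gc_seconds, gc_seconds / (total_ms / 1000.0)


gc_seconds, fraction = gc_share(TOTAL_RUNTIME_MS, GC_COUNT, GC_COST_S)
print(f"GC time: {gc_seconds:.0f}s, about {fraction:.0%} of the run")
```

If those figures hold, GC alone would account for most of the run, which is why the per-token GC looks like a better target than the test logic itself.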
Re: B2G emulator issues
I ran crashtest/reftest/marionette/xpcshell/mochitest on emulator-x86-kk, filed the related bugs, and made them block bug 753928. Basically:

1) need to carry --emulator x86 automatically (bug 996443)
2) need to add the x86 emulator for xpcshell tests (bug 996473)
3) PROCESS-CRASH at the end of reftest/crashtest (bug 996449)

With some temporary workarounds for the above, all the test variants run on emulator-x86-kk and are about six times faster than on the ARM emulators.

Best regards,
Vicamo

On 4/9/14, 2:55 AM, Jonathan Griffin wrote:

On 4/8/2014 1:05 AM, Thomas Zimmermann wrote:

There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones.

To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary?

Best regards
Thomas

We'd need these things:

1 - a consensus we want to move to x86-based emulators, which presumes that architecture-specific problems aren't likely or important enough to warrant continued use of arm-based emulators
2 - RelEng would need to stand up x86-based KitKat emulator builds
3 - The A*Team would need to get all of the tests running against these builds
4 - The A*Team and developers would have to work on fixing the inevitable test failures that occur when standing up any new platform

I'll bring this topic up at the next B2G Engineering Meeting.

Jonathan
Re: B2G emulator issues
I ran crashtest/reftest/marionette/xpcshell/mochitest on emulator-x86-kk, have filed related bugs and make them block bug 753928. Basically: 1) need to carry --emulator x86 automatically (bug 996443) 2) to add x86 emulator for xpcshell tests (bug 996473) 3) PROCESS-CRASH at the end of reftest/crashtest (bug 996449) With some temporary solutions to above, all the test variants run on emulator-x86-kk and are about six times faster than ARM emulators. 6x is good, if everything works and the tools are all in place - though it means you're not running the real code used on devices, which could be a problem. 於 4/9/14, 2:55 AM, Jonathan Griffin 提到: On 4/8/2014 1:05 AM, Thomas Zimmermann wrote: There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones. To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary? I don't think the *fundamental* problem is that the emulator is slow; I think it's that the emulator doesn't simulate the environment very well, and because of that, being slow (and running slow debug code) makes things break. Before worrying about x86 emulator (or going *too* far down the run it on faster hardware road), we should verify that faster hardware will produce less spurious oranges. Manually standing up a few testers and letting them run the mochitest load (even by hand) until we have enough data to see what moving the tests will do. I do think with the current emulator running it on faster hardware *will* help wallpaper the fundamental problems. I base this on my experience with the M10 media tests that began this thread - they ran fine on a 2.5 year old xeon (~10s, no timeouts) and took hundreds of seconds (and timed out) on the AWS testers. So moving the media tests will likely be a large win. 
But this (or x86) doesn't address the fundamental problem, which is that the emulator clearly isn't emulating the underlying environment well, in particular timers (see some of the discussion in this thread). If we can address the fundamental problem (even crudely), the need for high-perf testers may decline or even go away.

-- Randell Jesup, Mozilla Corp remove news for personal email
Re: B2G emulator issues
於 4/15/14, 9:42 PM, Randell Jesup 提到: I ran crashtest/reftest/marionette/xpcshell/mochitest on emulator-x86-kk, have filed related bugs and make them block bug 753928. Basically: 1) need to carry --emulator x86 automatically (bug 996443) 2) to add x86 emulator for xpcshell tests (bug 996473) patch available, in review. 3) PROCESS-CRASH at the end of reftest/crashtest (bug 996449) Actually we have more trouble than this, but I think that can be improved with time. The top of the list should be the lack of gdb/gdbserver and maybe other debugging tools for x86 emulators. Rebuild AOSP toolchain doesn't seem to be a trivial task. :( With some temporary solutions to above, all the test variants run on emulator-x86-kk and are about six times faster than ARM emulators. 6x is good, if everything works and the tools are all in place - though it means you're not running the real code used on devices, which could be a problem. However, emulator is also not real code used on devices. ;) 於 4/9/14, 2:55 AM, Jonathan Griffin 提到: On 4/8/2014 1:05 AM, Thomas Zimmermann wrote: There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones. To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary? I don't think the *fundamental* problem is that the emulator is slow; I think it's that the emulator doesn't simulate the environment very well, and because of that, being slow (and running slow debug code) makes things break. Before worrying about x86 emulator (or going *too* far down the run it on faster hardware road), we should verify that faster hardware will produce less spurious oranges. Manually standing up a few testers and letting them run the mochitest load (even by hand) until we have enough data to see what moving the tests will do. I do think with the current emulator running it on faster hardware *will* help wallpaper the fundamental problems. 
I base this on my experience with the M10 media tests that began this thread - they ran fine on a 2.5 year old xeon (~10s, no timeouts) and took hundreds of seconds (and timed out) on the AWS testers. So moving the media tests will likely be a large win. But this (or x86) doesn't address the fundamental problem, which is that the emulator clearly isn't emulating the underlying environment well, in particular timers (see some of the discussion in this thread). If we can address the fundamental problem (even crudely), the need for high-perf testers may decline or even go away.
Re: B2G emulator issues
Hi That is what the emulator is already doing. If we start emulating HW down to individual CPU cycles, it'll only get slower. :( I think this is wrong in some way. Otherwise I wouldn't see this: 1) running on TBPL (AWS) the internal timings reported show the specific test going from 30 seconds to 450 seconds with the patch. 2) on my local system, the test self-reports ~10 seconds, with or without the patch. The only way I can see that happening is if the simulator in some way exposes the underlying platform performance (in specific timers). Right. What I mean is that we're currently emulating an ARM chipset, but without timing. If we start doing cycle-correct emulation, it won't get faster. Another option (likely not simple) would be to find a way to slow down time for the emulator, such as intercepting system calls and increasing any time constants (multiplying timer values, timeout values to socket calls, etc, etc). This may not be simple. For devices (audio, etc), frequencies may need modifying or other adjustments made. If we do that, writing and debugging tests will take even longer. It shouldn't, if the the system self-adapted (per below). That should give a much more predictable (and closer-to-similar to a real device) result. BTW, I presume we're simulating a single-core ARM, so again not entirely representative anymore. Oh, I now get the point of this idea. We could probably implement this by modifying the emulated timer(s?) within the emulator; hw/goldfish_timer.c might be the place. Although I wouldn't do this if we have other options. Don't know how this would affect frequencies (audio, etc.). Best regards Thomas We could require that the emulator needs X Bogomips to run, or to run a specific test suite. We could segment out tests that require higher performance and run them on faster VMs/etc. Do we already know which tests are slow and why? Maybe there are ways to optimize the emulator. 
For example, if we execute lots of driver code within the guest, maybe we can move some of that into the emulator's binary, where it runs on the native machine. Dunno. But it's REALLY slow. Native linux on tbpl for a specific test: 1s. Local emulator (fast 2year-old desktop) 10s. tbpl before patch 30-40s. after 350-450 and we're lucky it finishes at all. So compared to AWS linux native it's ~30-40x slower without the patch, 300+ x slower with. (Again speaks to realtime stuff leaving no CPU for test running on tbpl.) Others can speak to overall speed. We could turn off certain tests on tbpl and run them on separate dedicated test machines (a bit similar to PGO). There are downsides to this of course. Lastly, we could put in a bank of HW running B2G to run the tests like the Android test boards/phones. There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones. Sure. Most don't do that I presume (very few) To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary? Dunno. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: B2G emulator issues
That is what the emulator is already doing. If we start emulating HW down to individual CPU cycles, it'll only get slower. :(

I think this is wrong in some way. Otherwise I wouldn't see this:

1) running on TBPL (AWS) the internal timings reported show the specific test going from 30 seconds to 450 seconds with the patch.
2) on my local system, the test self-reports ~10 seconds, with or without the patch.

The only way I can see that happening is if the simulator in some way exposes the underlying platform performance (in specific timers).

Right. What I mean is that we're currently emulating an ARM chipset, but without timing. If we start doing cycle-correct emulation, it won't get faster.

I still think there's a confusion here. How are timers connected? If I set a timer for 20ms, is that measured in ARM instructions? (Ignoring whether the numbers are cycle-accurate or not; I'd be fine with assuming 1 instruction per cycle.) Or is it 20ms on the host machine, regardless of how fast or slow the ARM emulation is running? (I think strongly it's the latter.)

Another option (likely not simple) would be to find a way to slow down time for the emulator, such as intercepting system calls and increasing any time constants (multiplying timer values, timeout values to socket calls, etc, etc). This may not be simple. For devices (audio, etc), frequencies may need modifying or other adjustments made.

If we do that, writing and debugging tests will take even longer.

It shouldn't, if the system self-adapted (per below). That should give a much more predictable (and closer-to-similar to a real device) result. BTW, I presume we're simulating a single-core ARM, so again not entirely representative anymore.

Oh, I now get the point of this idea. We could probably implement this by modifying the emulated timer(s?) within the emulator; hw/goldfish_timer.c might be the place. Although I wouldn't do this if we have other options. Don't know how this would affect frequencies (audio, etc.).
If we do this, it wouldn't really affect those - but from the host side the data coming out would be slow (or even fast on a fast machine). I wouldn't use this for interactive use for media (that gets even more fun); this would be an option to adjust timing or not.

-- Randell Jesup, Mozilla Corp remove news for personal email
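To make the "slow down time for the emulator" idea concrete, here is a minimal Python sketch of a dilated clock. It is only an illustration of the scaling being discussed: the class name, its methods, and the factor of 10 are all hypothetical, and it is not the emulator's timer code (the thread only suggests that hw/goldfish_timer.c might be the place for a real change).

```python
class DilatedClock:
    """Hypothetical model: guest time runs `dilation` times slower than host time."""

    def __init__(self, dilation, host_start_s=0.0):
        if dilation <= 0:
            raise ValueError("dilation must be positive")
        self.dilation = dilation
        self.host_start_s = host_start_s

    def guest_elapsed_s(self, host_now_s):
        """Seconds the guest believes have passed at a given host time."""
        return (host_now_s - self.host_start_s) / self.dilation

    def host_timeout_s(self, guest_timeout_s):
        """Host seconds to wait so that the guest sees guest_timeout_s elapse."""
        return guest_timeout_s * self.dilation


# Assumed factor: the emulated code runs about 10x slower than realtime.
clock = DilatedClock(dilation=10.0)

# A 20 ms guest timer is stretched to roughly 200 ms of host time.
print(f"{clock.host_timeout_s(0.020) * 1000:.0f} ms on the host")

# 450 host seconds are reported to the guest as 45 seconds.
print(f"{clock.guest_elapsed_s(450.0):.0f} s seen by the guest")
```

Under this model the guest's timeouts and timer periods stay in proportion to how much work the emulated CPU can actually do, which is the self-adapting behaviour described above. Data crossing to the host side (audio, sockets) would still arrive slower than realtime.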
Re: B2G emulator issues
Hi, Thanks for bringing up this issue. One option (very, very painful, and even slower) would be a proper device simulator which simulates both the CPU and the system hardware (of *some* B2G phone). This would produce the most realistic result with an emulator. That is what the emulator is already doing. If we start emulating HW down to individual CPU cycles, it'll only get slower. :( Another option (likely not simple) would be to find a way to slow down time for the emulator, such as intercepting system calls and increasing any time constants (multiplying timer values, timeout values to socket calls, etc, etc). This may not be simple. For devices (audio, etc), frequencies may need modifying or other adjustments made. If we do that, writing and debugging tests will take even longer. We could require that the emulator needs X Bogomips to run, or to run a specific test suite. We could segment out tests that require higher performance and run them on faster VMs/etc. Do we already know which tests are slow and why? Maybe there are ways to optimize the emulator. For example, if we execute lots of driver code within the guest, maybe we can move some of that into the emulator's binary, where it runs on the native machine. We could turn off certain tests on tbpl and run them on separate dedicated test machines (a bit similar to PGO). There are downsides to this of course. Lastly, we could put in a bank of HW running B2G to run the tests like the Android test boards/phones. There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones. To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary? Best regards Thomas So, what do we do? Because if we do nothing, it will only get worse. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: B2G emulator issues
On 14-04-07 08:49 PM, Ehsan Akhgari wrote:

On 2014-04-07, 8:03 PM, Robert O'Callahan wrote:

When you say debug, you mean the emulator is running a FirefoxOS debug build, not that the emulator itself is built debug --- right? Given that, is it a correct summary to say that the problem is that the emulator is just too slow? Applying time dilation might make tests green but we'd be left with the problem of the tests still taking a long time to run. Maybe we should identify a subset of the tests that are more likely to suffer B2G-specific breaking and only run those?

Do we disable all compiler optimizations for those debug builds? Can we turn them on, let's say, build with --enable-optimize and --enable-debug which gives us a -O2 optimized debug build?

In my experience running tests locally, a single mochitest run on the ARM emulator (hardware: Thinkpad X220, 16GB RAM, SSD) where everything was built with 'B2G_DEBUG=0 B2G_NOOPT=0' will run in 2 to 3 minutes. The same test, run with 'B2G_DEBUG=1 B2G_NOOPT=0' will take 7 to 10 minutes.

--m.
Re: B2G emulator issues
Hi, Thanks for bringing up this issue. One option (very, very painful, and even slower) would be a proper device simulator which simulates both the CPU and the system hardware (of *some* B2G phone). This would produce the most realistic result with an emulator. That is what the emulator is already doing. If we start emulating HW down to individual CPU cycles, it'll only get slower. :( I think this is wrong in some way. Otherwise I wouldn't see this: 1) running on TBPL (AWS) the internal timings reported show the specific test going from 30 seconds to 450 seconds with the patch. 2) on my local system, the test self-reports ~10 seconds, with or without the patch. The only way I can see that happening is if the simulator in some way exposes the underlying platform performance (in specific timers). Note: the timer in question is nsITimer::TYPE_REPEATING_PRECISE with 10ms timing. And changing it to 100ms makes the tests reliably green. Another option (likely not simple) would be to find a way to slow down time for the emulator, such as intercepting system calls and increasing any time constants (multiplying timer values, timeout values to socket calls, etc, etc). This may not be simple. For devices (audio, etc), frequencies may need modifying or other adjustments made. If we do that, writing and debugging tests will take even longer. It shouldn't, if the the system self-adapted (per below). That should give a much more predictable (and closer-to-similar to a real device) result. BTW, I presume we're simulating a single-core ARM, so again not entirely representative anymore. We could require that the emulator needs X Bogomips to run, or to run a specific test suite. We could segment out tests that require higher performance and run them on faster VMs/etc. Do we already know which tests are slow and why? Maybe there are ways to optimize the emulator. 
For example, if we execute lots of driver code within the guest, maybe we can move some of that into the emulator's binary, where it runs on the native machine.

Dunno. But it's REALLY slow. Native linux on tbpl for a specific test: 1s. Local emulator (fast 2-year-old desktop): 10s. tbpl before patch: 30-40s. After: 350-450s, and we're lucky it finishes at all. So compared to AWS linux native it's ~30-40x slower without the patch, 300+ x slower with. (Again speaks to realtime stuff leaving no CPU for test running on tbpl.) Others can speak to overall speed.

We could turn off certain tests on tbpl and run them on separate dedicated test machines (a bit similar to PGO). There are downsides to this of course. Lastly, we could put in a bank of HW running B2G to run the tests like the Android test boards/phones.

There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones.

Sure. Most don't do that I presume (very few)

To me, the idea of switching to a x86-based emulator seems to be the most promising solution. What would be necessary?

Dunno.

-- Randell Jesup, Mozilla Corp remove news for personal email
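The slowdown figures quoted above can be reproduced with a few lines of Python. This is just the arithmetic from the message; the dictionary keys and the helper name are illustrative, and the timings are the ones reported in this thread rather than new measurements.

```python
NATIVE_S = 1.0  # native Linux on tbpl, as reported above

# (low, high) runtimes in seconds for the same test, as reported above.
timings_s = {
    "local emulator": (10, 10),
    "tbpl emulator, before patch": (30, 40),
    "tbpl emulator, after patch": (350, 450),
}


def slowdown(range_s, baseline_s):
    """Return the (low, high) slowdown factors relative to a baseline runtime."""
    low, high = range_s
    return low / baseline_s, high / baseline_s


for name, range_s in timings_s.items():
    low, high = slowdown(range_s, NATIVE_S)
    print(f"{name}: {low:.0f}x to {high:.0f}x slower than native")

# The same helper also gives the gap between tbpl and a fast desktop.
low, high = slowdown(timings_s["tbpl emulator, after patch"], 10.0)
print(f"tbpl after patch vs local emulator: {low:.0f}x to {high:.0f}x")
```

The last line is the interesting one: on these numbers the same emulated test is 35 to 45 times slower on tbpl than on a desktop once the patch is applied, far more than the 3x to 4x gap before it.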
Re: B2G emulator issues
On 4/8/2014 1:05 AM, Thomas Zimmermann wrote:

There are tests that instruct the emulator to trigger certain HW events. We can't run them on actual phones. To me, the idea of switching to an x86-based emulator seems to be the most promising solution. What would be necessary? Best regards Thomas

We'd need these things:
1 - a consensus that we want to move to x86-based emulators, which presumes that architecture-specific problems aren't likely or important enough to warrant continued use of ARM-based emulators
2 - RelEng would need to stand up x86-based KitKat emulator builds
3 - The A*Team would need to get all of the tests running against these builds
4 - The A*Team and developers would have to work on fixing the inevitable test failures that occur when standing up any new platform

I'll bring this topic up at the next B2G Engineering Meeting.

Jonathan
Re: B2G emulator issues
Randell Jesup writes:

1) running on TBPL (AWS) the internal timings reported show the specific test going from 30 seconds to 450 seconds with the patch.
2) on my local system, the test self-reports ~10 seconds, with or without the patch.

Note: the timer in question is nsITimer::TYPE_REPEATING_PRECISE with 10ms timing. And changing it to 100ms makes the tests reliably green.

Do you know how many simultaneous hardware threads are emulated? Is it possible that the thread using TYPE_REPEATING_PRECISE has a high priority, and so it would occupy the single hardware thread when there is no spare time available for anything else? The time taken for the test run might depend on whatever else is running.
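The hypothesis in this message can be illustrated with a toy model. This is a sketch under stated assumptions, not a measurement: one emulated hardware thread, a timer callback that always runs ahead of the test and costs a fixed amount of CPU per period, and a test that needs a fixed amount of CPU work. The function name and the 9.3 ms callback cost are hypothetical; the cost was picked only so the result lands in the range reported on tbpl.

```python
def wall_time_s(work_s, period_ms, callback_cost_ms):
    """Wall time needed to finish `work_s` seconds of test CPU work on one
    hardware thread while a higher-priority timer callback consumes
    `callback_cost_ms` of CPU every `period_ms`.

    Returns float('inf') when the callback leaves no spare CPU at all.
    """
    spare = 1.0 - callback_cost_ms / period_ms  # fraction of CPU left for the test
    if spare <= 0.0:
        return float("inf")
    return work_s / spare

# Hypothetical slow host: the callback costs 9.3 ms per firing.
slow_10ms = wall_time_s(30.0, period_ms=10.0, callback_cost_ms=9.3)
slow_100ms = wall_time_s(30.0, period_ms=100.0, callback_cost_ms=9.3)

print(f"10 ms timer:  {slow_10ms:.0f} s")
print(f"100 ms timer: {slow_100ms:.0f} s")
```

In this model the run time is very sensitive to the callback cost once it approaches the timer period, and stretching the period to 100ms returns almost all of the CPU to the test. That would be consistent with the observation that 100ms makes the tests green, though the model does not show that this is what actually happens inside the emulator.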
B2G emulator issues
The B2G emulator design is causing all sorts of problems. We just fixed the #2 orange which was caused by the Audio channel StartPlaying() taking up to 20 seconds to run (and we fixed it by effectively removing some timeouts). However, we just wasted half a week trying to land AEC MediaStreamGraph improvements. We still haven't landed due to yet another B2G emulator orange, but the solution we used for the M10 problem doesn't fix the fundamental problems with B2G emulator.

Details: We ran into huge problems getting AEC/MediaStreamGraph changes (bug 818822 and things dependent on it) into the tree due to problems with b2g-emulator debug M10 (permaorange timeouts). This test adds a fairly small amount of processing to input audio data (resampling to 44100Hz). A test that runs perfectly in emulator opt builds and runs fine locally in M10 debug (10-12 seconds reported for the test in the logs, with or without the change), goes from taking 30-40 seconds on tbpl to 350-450(!) seconds (and then times out). Fix that one, and others fail even worse.

I contacted Gregor Wagner asking for help and also jgriffin in #b2g. We found one problem (emulator going to 'sleep' during mochitests, bug 992436); I have a patch up to enable wakelock globally for mochitests. However, that just pushed the error a little deeper.

The fundamental problem is that b2g-emulator can't deal safely with any sort of realtime or semi-realtime data unless run on a fast machine. The architecture for the emulator setup means the effective CPU power is dependent on the machine running the test, and that varies a lot (and tbpl machines are WAY slower than my 2.5 year old desktop). Combine that with Debug being much slower, and it's a recipe for disaster for any sort of time-dependent tests.

I worked around it for now, by turning down the timers that push fake realtime data into the system - this will cause audio underruns in MediaStreamGraph, and doesn't solve the problem of MediaStreamGraph potentially overloading itself for other reasons, or breaking assumptions about being able to keep up with data streams. (MSG wants to run every 10ms or so.) This problem also likely plays hell with the Web Audio tests, and will play hell with WebRTC echo cancellation and the media reception code, which will start trying to insert loss-concealment data and break timer-based packet loss recovery, bandwidth estimators, etc.

As to what to do? That's a good question, as turning off the emulator tests isn't a realistic option.

One option (very, very painful, and even slower) would be a proper device simulator which simulates both the CPU and the system hardware (of *some* B2G phone). This would produce the most realistic result with an emulator.

Another option (likely not simple) would be to find a way to slow down time for the emulator, such as intercepting system calls and increasing any time constants (multiplying timer values, timeout values to socket calls, etc, etc). This may not be simple. For devices (audio, etc), frequencies may need modifying or other adjustments made.

We could require that the emulator needs X Bogomips to run, or to run a specific test suite.

We could segment out tests that require higher performance and run them on faster VMs/etc.

We could turn off certain tests on tbpl and run them on separate dedicated test machines (a bit similar to PGO). There are downsides to this of course.

Lastly, we could put in a bank of HW running B2G to run the tests like the Android test boards/phones.

So, what do we do? Because if we do nothing, it will only get worse.

-- Randell Jesup, Mozilla Corp remove news for personal email
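The "slow down time" option can be sketched as a clock wrapper that stretches every timeout by a dilation factor, with the factor derived from how slow the host turned out to be. This only illustrates the idea: `DilatedClock` and `calibrate` are hypothetical names, and a real implementation would have to intercept system calls inside the emulator, which this sketch does not attempt.

```python
import time

def calibrate(measured_s, reference_s):
    """Hypothetical calibration: how many times slower this host ran a
    reference workload than the target device would. Never below 1.0."""
    return max(1.0, measured_s / reference_s)

class DilatedClock:
    """Stretches timeouts by `factor`, so code written against real-time
    deadlines gets proportionally more wall time on a slow host."""

    def __init__(self, factor, now=time.monotonic):
        if factor < 1.0:
            raise ValueError("factor must be >= 1.0")
        self.factor = factor
        self._now = now          # injectable wall-clock source
        self._start = now()

    def timeout(self, seconds):
        """Wall-clock duration to actually wait for a nominal timeout."""
        return seconds * self.factor

    def now(self):
        """Guest-visible elapsed time: wall time since start, slowed down."""
        return (self._now() - self._start) / self.factor

# Example: a host that needs 350 s for work that takes 10 s elsewhere.
factor = calibrate(measured_s=350.0, reference_s=10.0)
clock = DilatedClock(factor)
print(f"a nominal 10 ms timer would fire every {clock.timeout(0.010) * 1000:.0f} ms")
```

The calibration step is one way a minimum-performance requirement (the "X Bogomips" idea) and time dilation could share a measurement: a host below the bar either gets rejected or gets a larger factor. As noted elsewhere in the thread, dilation would not make the tests any faster, only more predictable.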
Re: B2G emulator issues
How easy is it to identify CPU-sensitive tests? I think the most practical solution (at least in the near term) is to find that set of tests, and run only that set on a faster VM, or on real hardware (like our ix slaves).

Jonathan

On 4/7/2014 3:16 PM, Randell Jesup wrote: The B2G emulator design is causing all sorts of problems.
Re: B2G emulator issues
On 4/7/2014 3:16 PM, Randell Jesup wrote: The B2G emulator design is causing all sorts of problems. We just fixed

That sounds very similar to some of the failures seen on the Android 2.3 emulator. Many media-related mochitests intermittently time out on the Android 2.3 emulator when run on AWS. These are reported in bug 981889, bug 981886, bug 981881, and bug 981898, but have not been investigated.
Re: B2G emulator issues
How easy is it to identify CPU-sensitive tests?

Easy for some (most but not all media tests). Almost all getUserMedia/PeerConnection tests. ICE/STUN/TURN tests. Not that easy for some. And some may be only indirectly sensitive - timeouts in delay-the-rendering code, TCP/DNS/SPDY timers, etc, etc. Anything that touches a timer even indirectly *could* be. So, large sections *could* be. I suppose we could include code checking for MainThread starvation as a partial check, though that won't catch everything.

I think the most practical solution (at least in the near term) is to find that set of tests, and run only that set on a faster VM, or on real hardware (like our ix slaves).

That was an option I mentioned. It's not fun and will be a continual "is this orange CPU sensitive?" as they pop up, but it certainly can be done. And it may be simpler than better solutions.

-- Randell Jesup, Mozilla Corp remove news for personal email
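A partial starvation check of the kind suggested above could compare when a periodic timer was due with when it actually fired. The sketch below works on a list of recorded fire times, so it could be run over timestamps pulled from a test log; the function names, the tolerance, and the 10% threshold are all hypothetical choices, not anything taken from the Gecko code.

```python
def late_ticks(fire_times_ms, period_ms, tolerance_ms):
    """Return (index, lateness_ms) for every tick that fired more than
    `tolerance_ms` after its ideal time, counting from the first tick."""
    if not fire_times_ms:
        return []
    start = fire_times_ms[0]
    late = []
    for i, fired in enumerate(fire_times_ms):
        expected = start + i * period_ms   # ideal schedule, no drift
        lateness = fired - expected
        if lateness > tolerance_ms:
            late.append((i, lateness))
    return late

def looks_starved(fire_times_ms, period_ms, tolerance_ms, max_late_fraction=0.1):
    """Flag a run as starved when too large a share of ticks fired late."""
    if not fire_times_ms:
        return False
    late = late_ticks(fire_times_ms, period_ms, tolerance_ms)
    return len(late) / len(fire_times_ms) > max_late_fraction

# Made-up sample: a 10 ms timer that falls behind after the third tick.
sample = [0, 10, 21, 55, 90, 100]
print(late_ticks(sample, period_ms=10, tolerance_ms=5))
print(looks_starved(sample, period_ms=10, tolerance_ms=5))
```

A check like this only sees the thread whose timer is being recorded, so, as the message says, it would not catch everything: a test can still time out because some other thread or an indirect timer was the one that got starved.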
Re: B2G emulator issues
When you say debug, you mean the emulator is running a FirefoxOS debug build, not that the emulator itself is built debug --- right?

Given that, is it a correct summary to say that the problem is that the emulator is just too slow? Applying time dilation might make tests green but we'd be left with the problem of the tests still taking a long time to run.

Maybe we should identify a subset of the tests that are more likely to suffer B2G-specific breaking and only run those?

Rob
-- Jtehsauts tshaei dS,o n Wohfy Mdaon yhoaus eanuttehrotraiitny eovni le atrhtohu gthot sf oirng iyvoeu rs ihnesa.rt sS?o Whhei csha iids teoa stiheer :p atroa lsyazye,d 'mYaonu,r sGients uapr,e tfaokreg iyvoeunr, 'm aotr atnod sgaoy ,h o'mGee.t uTph eann dt hwea lmka'n? gBoutt uIp waanndt wyeonut thoo mken.o w
Re: B2G emulator issues
On 2014-04-07, 8:03 PM, Robert O'Callahan wrote:

When you say debug, you mean the emulator is running a FirefoxOS debug build, not that the emulator itself is built debug --- right? Given that, is it a correct summary to say that the problem is that the emulator is just too slow? Applying time dilation might make tests green but we'd be left with the problem of the tests still taking a long time to run. Maybe we should identify a subset of the tests that are more likely to suffer B2G-specific breaking and only run those?

Do we disable all compiler optimizations for those debug builds? Can we turn them on, let's say, build with --enable-optimize and --enable-debug which gives us a -O2 optimized debug build?
Re: B2G emulator issues
Why don’t we just switch to x86 emulator? x86 emulator runs way faster than the ARM emulator.

Best Regards, Shih-Chiang Chien Mozilla Taiwan

On Apr 8, 2014, at 8:49 AM, Ehsan Akhgari ehsan.akhg...@gmail.com wrote: Do we disable all compiler optimizations for those debug builds? Can we turn them on, let's say, build with --enable-optimize and --enable-debug which gives us a -O2 optimized debug build?