Re: any concerns with dropping the talos test v8 and using AWFY Octane instead?

William Lachance Tue, 12 Jan 2016 10:30:43 -0800

On 2016-01-11 4:12 PM, Eric Rahm wrote:

On Monday, January 11, 2016 at 8:42:11 AM UTC-8, William Lachance wrote:

It seems like another alternative might be to run Octane in Talos,
instead of v8_7.


It seems like Talos has two advantages over AWFY (correct me if I'm wrong):

1. Easy for developers to schedule jobs via try (maybe less of a concern
with a benchmark like this, where I suspect results are more
reproducible locally?)


I believe there was talk of adding try support for AWFY (there already is for 
AWSY). Of course that's not actually done yet, I just want to point out it's 
not particularly hard and AWSY's version could be adapted rather easily.

2. More hardware available, so can get results faster.


I would guess we want to run on dedicated non-virtualized hardware for these 
tests. Is that an option w/ Talos? FWIW if that's an option I'd be more than 
happy to move AWSY over to the platform as well :)

Talos already runs on non-virtualized hardware. I don't see any inherent
reason we couldn't rework AWSY as a Talos test. In general it feels to
me like we should be running performance tests on relops-supported
infrastructure where possible, as opposed to ad hoc systems.

Thoughts? Incidentally one of my deliverables for this quarter is to try
to figure out how Perfherder, Talos, and AWFY should co-exist, so I'm
very interested in knowing if my assumptions above are correct.


Regardless of whether we use Talos to run the tests or not, it would 
definitely be nice to have the data reported in perfherder.

A digression, maybe worth followup in a separate thread:

In general it would be great if we could consolidate the various perf tests (AWFY, AWSY, Talos, Raptor, etc) under one umbrella 
(at least from an end user perspective). So you could go to trychooser and choose a "Perf" option that would have 
various subsets like: "JS Engine", "Memory Usage", "Layout Latency", "Mobile Launch 
Time", etc.

If all of these systems reported their data to perfherder (and optionally 
elsewhere) we'd now have one centralized location where you can track perf 
regressions. As an end-user this is pretty great: The graphs look the same 
across systems, I only have to learn how to use one tool, I only have to learn 
how to interpret regressions in one system.

Yes, this seems like a good long-term goal to me. There are a few
constraints that Perfherder has that make it unsuitable for some use cases:

1. It assumes that all test machines of a particular class will be
uniform, at least per test. For example, Autophone tracks the
performance of something like 9 different Android devices separately
(see: http://phonedash.mozilla.org/) -- that's not something Perfherder
was designed to do.

2. It is designed to track performance changes in one product per
repository, not compare one against another. It's not designed to
facilitate comparisons between Firefox and Chrome.

That shouldn't stop us from using Perfherder as at least an optional
submission target for many systems though -- we've had good luck with
autophone so far as a potential replacement for Android Talos on a
single device class, and it looks like AWSY should work as well.

AWFY is a bit of a different animal as it has its own regression
detection/reporting system. In principle it makes sense to unify that
with Perfherder, but I'm not yet 100% sure what would be involved in
making that happen -- AWFY supports some things that aren't on
Perfherder's near-term roadmap (e.g. reporting a regression manually on
an arbitrary revision), so we need to figure that out.


Will
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
