Re: [webkit-dev] Iterating SunSpider

Maciej Stachowiak Sat, 04 Jul 2009 15:30:13 -0700


On Jul 4, 2009, at 1:06 PM, Peter Kasting wrote:

On Sat, Jul 4, 2009 at 11:47 AM, Mike Belshe <m...@belshe.com> wrote:
#3: The SunSpider harness has a variance problem due to CPU powersavings modes.
This one worries me because it decreases the consistency/reproducibility of test scores and makes it harder to compare engines or to track one engine's scores over time. For example, doing a bunch of CPU work just before running the benchmark can affect whether and when the CPU throttles down during the benchmark run.
Possible solution:
The dromaeo test suite already incorporates the SunSpider individual tests under a new benchmark harness which fixes all 3 of the above issues. Thus, one approach would be to retire SunSpider 0.9 in favor of Dromaeo. http://dromaeo.com/?sunspider Dromaeo has also done a lot of good work to ensure statistical significance of the results. Once we have a better benchmarking framework, it would be great to build a new microbenchmark mix which more realistically exercises today's JavaScript.
One complaint I have heard about the Dromaeo tests (not the harness) is that the actual JS that gets run differs from browser to browser (e.g. because it is a direct copy of a source library that does UA sniffing). If this is true it means that this suite as-is isn't useful to compare engines to each other.
However, the Dromaeo _harness_ is probably a win as-is.
Of course, changing anything about Sunspider raises the question of tracking historical performance. Perhaps the harness could support versioning, or perhaps people are simply willing to say "Sunspider 1.0 scores cannot be compared to Sunspider 0.9 scores". I believe this is the approach the V8 benchmark takes.

I think versioning the test content is right, and I think we should do that over time. I think a harness change to avoid triggering powersaving mode on Windows would be a reasonable thing to do to the harness without a version change. I don't think Dromaeo is a good choice of harness - I don't think their results are stable enough and I am not confident in the statistical soundness of their methodology.
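
To illustrate the kind of harness change under discussion, here is a minimal sketch in Python. It is not code from SunSpider or Dromaeo. It assumes that a short burst of CPU work right before the timed runs is enough to keep the CPU out of a low-power state, which may not hold on every machine. The names busy_warm_up, time_runs and summarize are hypothetical, and the interval uses a normal approximation (z = 1.96) as a simplification.

```python
import math
import statistics
import time


def busy_warm_up(seconds):
    """Spin the CPU for roughly `seconds` (assumed to lift it out of a low-power state)."""
    deadline = time.perf_counter() + seconds
    counter = 0
    while time.perf_counter() < deadline:
        counter += 1
    return counter


def time_runs(workload, runs=10, warm_up_seconds=0.2):
    """Warm up once, then time `workload` `runs` times; returns times in milliseconds."""
    busy_warm_up(warm_up_seconds)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (mean, half-width of an approximate 95% interval) for the samples."""
    mean = statistics.mean(samples)
    # Normal approximation; with few runs a t-distribution would be more appropriate.
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


if __name__ == "__main__":
    times = time_runs(lambda: sum(range(100_000)))
    mean_ms, plus_minus = summarize(times)
    print(f"{mean_ms:.3f}ms +/- {plus_minus:.3f}ms")
```

Reporting the interval alongside the mean makes run-to-run variance visible, so a noisy result is not mistaken for a real difference between engines.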


Regards,
Maciej

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
