On Thursday, January 31, 2013 9:17:44 AM UTC-8, Joshua Cranmer wrote:
> On 1/31/2013 10:51 AM, Ehsan Akhgari wrote:
> > On 2013-01-31 11:43 AM, Kyle Huey wrote:
> >> On Wed, Jan 30, 2013 at 8:03 PM, Ehsan Akhgari <ehsan.akhg...@gmail.com
> >> <mailto:ehsan.akhg...@gmail.com>> wrote:
> >>
> >>     We then tried to get a sense of how much of a win the PGO
> >>     optimizations are.  Thanks to a series of measurements by dmandelin,
> >>     we know that disabling PGO/LTCG will result in a regression of about
> >>     10-20% on benchmarks which examine DOM and layout performance such
> >>     as Dromaeo and guimark2 (and 40% in one case), but no significant
> >>     regressions in startup time or Gmail interactions.  Thanks to
> >>     a series of telemetry measurements performed by Vladan on a Nightly
> >>     build we did last week which had PGO/LTCG disabled, there are no
> >>     telemetry probes which show a significant regression on builds
> >>     without PGO/LTCG.  Vladan is going to try to get this data out of a
> >>     Tp5 run tomorrow as well, but we don't have any evidence to believe
> >>     that the results of that experiment will be any different.
> >>
> >> Isn't PGO worth something like 15% on Ts?
> 
> > That was what I thought, but local measurements performed by dmandelin 
> > proved otherwise.
> 
> For what it's worth, reading 
> <https://bugzilla.mozilla.org/show_bug.cgi?id=833890>, I do not get the 
> impression that dmandelin "proved" otherwise. His startup tests have 
> very low statistical confidence (n=2, n=3), and he himself disclaims 
> his own findings. It may be evidence that PGO is not a Ts win, but it is 
> weak evidence at best. 

I could certainly run a larger number of trials to see what happens. In that 
case, I stopped because the min values for warm startup were about equal (and 
also happened to be about equal to other warm startup times I had measured 
recently). For many timed benchmarks, "base value + positive random noise" 
seems like a good model, in which case mins seem like good things to compare.
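
To illustrate why that model favors comparing mins, here is a rough Python
sketch; the base time and noise distribution below are invented for the
example, not taken from any of the actual measurements:

    import random

    def simulated_run(base_ms=900.0):
        # "base value + positive random noise": a run can only be slower
        # than the true base time, never faster.
        return base_ms + random.expovariate(1 / 50.0)  # mean ~50 ms of noise

    for n in (2, 3, 10, 50):
        samples = [simulated_run() for _ in range(n)]
        # The min converges toward the base value much faster than the
        # mean does, which is why comparing mins can be defensible even
        # with very few trials.
        print("n=%2d  min=%7.1f ms  mean=%7.1f ms"
              % (n, min(samples), sum(samples) / n))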

> Our Talos results may be measuring imperfect things, but we have 
> enough datapoints that we can draw statistical conclusions from 
> them confidently. 

Statistics doesn't help if you're measuring the wrong things. Whether Ts is 
measuring the wrong thing, I don't know. It would be possible to learn 
something about that question by measuring startup with a camera, Telemetry 
simple measures, and Talos on the same machine and seeing how they compare.
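
If someone wanted to run that comparison, even something this simple would be
informative; the numbers below are placeholders for paired measurements of the
same startups, not real data:

    import statistics

    # Hypothetical paired measurements (ms) of the same startups on one
    # machine; replace with real camera / Telemetry / Talos numbers.
    camera    = [1210, 1180, 1250, 1190, 1230]
    telemetry = [1150, 1120, 1205, 1135, 1170]
    talos     = [ 840,  835,  870,  845,  860]

    def pearson(xs, ys):
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / (len(xs) * statistics.pstdev(xs) * statistics.pstdev(ys))

    # High correlation with the camera numbers would suggest a method is
    # tracking what users actually see, even if its absolute values differ.
    print("camera vs telemetry:", pearson(camera, telemetry))
    print("camera vs talos:    ", pearson(camera, talos))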

By the way, there is a project (in a very early phase now) to do accurate 
measurements of startup time, both cold and warm, on machines that model user 
hardware, etc.

Dave