Re: Benefits of PGO on Windows

2012-10-19 Thread Justin Lebar
> If we really wanted to know, either someone would have to spend some time
> doing this over and over, or we'd have to use Telemetry with some A/B testing.

This would actually be a pretty easy thing to do, to a first
approximation anyway.  Just turn off PGO on Windows for one nightly
build and see how that affects all our metrics.

I'll grant that's not a proper A/B study, but it'd probably be good enough.

On Fri, Oct 19, 2012 at 9:55 PM, Dave Mandelin  wrote:
> On Thursday, October 18, 2012 4:59:10 AM UTC-7, Ted Mielczarek wrote:
>> If you're interested in the benchmark side of things, it's fairly easy
>> to compare now that we build both PGO and non-PGO builds on a regular
>> basis. I'm having a little trouble getting graphserver to give me recent
>> data, but you can pick arbitrary tests that we run on Talos and graph
>> them side-by-side for the PGO and non-PGO cases. For example, here's Ts
>> and "Tp5 MozAfterPaint" for Windows 7 on both PGO and non-PGO builds
>> (the data ends in February for some reason):
>>
>> http://graphs.mozilla.org/graph.html#tests=[[16,1,12],[115,1,12],[16,94,12],[115,94,12]]&sel=none&displayrange=365&datatype=running
>>
>> You can see that there's a pretty solid 10-20% advantage to PGO in these
>> tests.
>
> Ah. That answers my question about more data.
>
> For Ts, I see a difference of only 70ms (e.g., 520-590 at the last point). 
> That's borderline trivial, but the differences I measure are much greater. 
> What does Ts actually measure, anyway? Is it measuring only from main() 
> starting to first paint, or something like that?
>
> For Tp5, I see a difference of 80ms (330-410 and such). I'm not really sure 
> what to make of that. By itself, it doesn't necessarily seem like it would 
> be that noticeable, but the fraction is big enough that if it holds up for 
> longer and bigger pages, I could see it slightly improving pageloads and 
> probably also reducing some pauses for layout and such. From what I 
> understand about Tp5, it's not really measuring modern pageloads (ignores 
> network and isn't focused on popular sites). I wish we had something more 
> representative so we could draw better conclusions (and not just about PGO).
>
>> Here's Dromaeo (DOM) which displays a similar 20% advantage:
>>
>> http://graphs.mozilla.org/graph.html#tests=[[73,94,12],[73,1,12]]&sel=none&displayrange=365&datatype=running
>>
>> It's certainly hard to draw a conclusion about your hypothesis from just
>> benchmarks, but when almost all of our benchmarks display 10-20%
>> reductions on PGO builds it seems fair to say that that's likely to be
>> user-visible.
>
> It seems fair to me to say that core browser CPU-bound tasks are likely to be 
> 10-20% faster. There is probably some of that users can notice, although I'm 
> not sure exactly what it would be. The JS benchmarks do show the PGO build as 
> faster, but I haven't tested other JS-based things to see if it's 
> noticeable. I guess I should be testing game framerates or something like 
> that too.
>
>> We've spent hundreds of man-hours for perf gains far less than that.
>
> Yes, we need to get more judicious about how we apply our perf efforts. :-)
>
>> On a related note, Will Lachance has been tasked with getting our
>> Eideticker performance measurement framework working with Windows, so we
>> should be able to experimentally measure user-visible responsiveness in
>> the near future.
>
> I'm curious to see what kinds of tests it will enable.
>
> Dave
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Benefits of PGO on Windows

2012-10-19 Thread Dave Mandelin
On Thursday, October 18, 2012 4:59:10 AM UTC-7, Ted Mielczarek wrote:
> If you're interested in the benchmark side of things, it's fairly easy 
> to compare now that we build both PGO and non-PGO builds on a regular 
> basis. I'm having a little trouble getting graphserver to give me recent 
> data, but you can pick arbitrary tests that we run on Talos and graph 
> them side-by-side for the PGO and non-PGO cases. For example, here's Ts 
> and "Tp5 MozAfterPaint" for Windows 7 on both PGO and non-PGO builds 
> (the data ends in February for some reason):
> 
> http://graphs.mozilla.org/graph.html#tests=[[16,1,12],[115,1,12],[16,94,12],[115,94,12]]&sel=none&displayrange=365&datatype=running
> 
> You can see that there's a pretty solid 10-20% advantage to PGO in these 
> tests.

Ah. That answers my question about more data.

For Ts, I see a difference of only 70ms (e.g., 520-590 at the last point). 
That's borderline trivial, but the differences I measure are much greater. What 
does Ts actually measure, anyway? Is it measuring only from main() starting to 
first paint, or something like that?

For Tp5, I see a difference of 80ms (330-410 and such). I'm not really sure 
what to make of that. By itself, it doesn't necessarily seem like it would be that 
noticeable, but the fraction is big enough that if it holds up for longer and 
bigger pages, I could see it slightly improving pageloads and probably also 
reducing some pauses for layout and such. From what I understand about Tp5, 
it's not really measuring modern pageloads (ignores network and isn't focused 
on popular sites). I wish we had something more representative so we could draw 
better conclusions (and not just about PGO).

> Here's Dromaeo (DOM) which displays a similar 20% advantage:
> 
> http://graphs.mozilla.org/graph.html#tests=[[73,94,12],[73,1,12]]&sel=none&displayrange=365&datatype=running
> 
> It's certainly hard to draw a conclusion about your hypothesis from just 
> benchmarks, but when almost all of our benchmarks display 10-20% 
> reductions on PGO builds it seems fair to say that that's likely to be 
> user-visible. 

It seems fair to me to say that core browser CPU-bound tasks are likely to be 
10-20% faster. There is probably some of that users can notice, although I'm 
not sure exactly what it would be. The JS benchmarks do show the PGO build as 
faster, but I haven't tested other JS-based things to see if it's noticeable. I 
guess I should be testing game framerates or something like that too.

> We've spent hundreds of man-hours for perf gains far less than that.

Yes, we need to get more judicious about how we apply our perf efforts. :-)

> On a related note, Will Lachance has been tasked with getting our 
> Eideticker performance measurement framework working with Windows, so we 
> should be able to experimentally measure user-visible responsiveness in 
> the near future.

I'm curious to see what kinds of tests it will enable.

Dave
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Benefits of PGO on Windows

2012-10-19 Thread Dave Mandelin
On Wednesday, October 17, 2012 11:00:13 PM UTC-7, Mike Hommey wrote:
> If you copy omni.ja from the PGO build to the opt build, you'll be able
> to see if everything comes from that. We're planning to make that
> currently PGO-only optimization run on all builds. (bug 773171)

Excellent suggestion, plus it made me repeat the experiment. The repeat turned 
up somewhat more confusing data that still seems to support PGO for Windows 
startup. I did 2-3 tests with each of 4 configurations (I botched one trial and 
didn't bother rebooting to test it again), and got this:

 pgo with pgo omni.ja    1.6 - 1.7 seconds (1.6, 1.7)
 pgo with opt omni.ja    1.4 - 1.6 seconds (1.4, 1.6, 1.6)
 opt with pgo omni.ja    1.3 - 8.0 seconds (1.3, 1.4, 8.0)
 opt with opt omni.ja    2.9 - 8.7 seconds (2.9, 6.2, 8.7)

The number of trials is too small to conclude very much. If we really wanted to 
know, either someone would have to spend some time doing this over and over, or 
we'd have to use Telemetry with some A/B testing.
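
To put numbers on just how little data that is, here's a throwaway Python sketch 
that does nothing more than summarize the trials above (the timings are exactly 
the ones listed; nothing else is real):

  # Throwaway sketch: summarize the handful of cold-start trials above.
  trials = {
      "pgo with pgo omni.ja": [1.6, 1.7],
      "pgo with opt omni.ja": [1.4, 1.6, 1.6],
      "opt with pgo omni.ja": [1.3, 1.4, 8.0],
      "opt with opt omni.ja": [2.9, 6.2, 8.7],
  }
  for config, times in sorted(trials.items()):
      avg = sum(times) / float(len(times))
      print("%-22s n=%d  mean=%.2fs  range=%.1f-%.1fs"
            % (config, len(times), avg, min(times), max(times)))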

What stands out despite the new weirdness is that opt/opt was always slower 
than pgo/pgo, and by about the same amount as in my first experiment (in the best 
case for opt/opt). 

Dave
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Easiest way to start using httpd.js?

2012-10-19 Thread Jim

On 10/17/2012 06:27 PM, Gregory Szorc wrote:
> I wish it were easier. If this pattern ever becomes prevalent, we should
> probably move the Python + JS harness to somewhere more central. File a
> bug against me and I'll happily refactor the code in services/ to be
> more generic and shareable.


Thanks for the extremely thorough information! I've finally gotten an 
(extremely basic) server running via httpd.js, so now comes the fun of 
implementing the actual server logic.


- Jim
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Gfx meeting, Monday 2:30 PM US/Pacific

2012-10-19 Thread Benoit Jacob
Hello,

The Graphics meeting will take place this Monday at 2:30 PM US/Pacific
time. That could be Tuesday in your timezone.

Please first add your agenda items there:

https://wiki.mozilla.org/Platform/GFX/2012-October-22

* Not every Monday at 2:30 PM Pacific Time
* +1 650 903 0800 x92 Conf# 99366
* +1 416 848 3114 x92 Conf# 99366
* +1 800 707 2533 (pin 369) Conf# 99366 (toll free, Skype)
* Video (Vidyo) link:
https://v.mozilla.com/flex.html?roomdirect.html&key=vu1FKlkBlT29
* Vidyo room 9366 (if you have LDAP and can log in at https://v.mozilla.com)

See you,
Benoit
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Minimum Required Python Version

2012-10-19 Thread Gregory Szorc
Inbound now requires Python 2.6: 
https://hg.mozilla.org/integration/mozilla-inbound/rev/09dc2dc1fc9f


Expect this to hit central in the next day or two.

If you find a distro that doesn't ship Python 2.6, we have a tool in the 
tree to perform system bootstrapping (python/mozboot). We should teach 
it how to install Python (preferably 2.7). File bugs for this tool under 
Core :: Build Config.
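
For reference, the guard is conceptually nothing more than the following (a 
sketch of the idea only, not the actual configure code):

  # Sketch of the kind of guard a build entry point can use.
  # Not the actual in-tree check -- illustration only.
  import sys

  MIN_VERSION = (2, 6)

  if sys.version_info < MIN_VERSION or sys.version_info >= (3, 0):
      sys.exit("Building the tree requires Python >= %d.%d and < 3.0; found %s"
               % (MIN_VERSION + (sys.version.split()[0],)))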


On 10/11/12 4:21 PM, Gregory Szorc wrote:

The general consensus seems to be "2.7 is good," so I filed
https://bugzilla.mozilla.org/show_bug.cgi?id=800614 to have configure
enforce Python 2.6 as the minimum required to *build* the tree. Note
that building is different from running tests (some test runners still
run on Python 2.5 and Talos is on Python 2.4).

We'll figure out the bump to 2.7 once we see how well 2.6 goes.

If anyone has new objections, please state them on the bug.

On 9/9/12 12:54 PM, Gregory Szorc wrote:

The subject of which version of Python to require to build the tree came
up in bug 784841.

We currently require Python >= 2.5 but <3 to build the tree. The main
reason for the 2.5 requirement is the Linux build slaves still run
Python 2.5. Those of us who code Python for the tree have long wanted to
require at least 2.6 because 2.5 is missing many desired features. And,
since 2.6+ is ubiquitous these days, people (sometimes unknowingly)
target it (because it's what's installed locally) and then have to go
through trouble (or tree breakage) to backport compatibility to 2.5.
Personally, I find targeting 2.5 so painful (especially once you've used
2.6+) that I sometimes get discouraged from landing new features in the
build system or test suites because I don't want to deal with 2.5
compatibility.

I'm pretty sure that no reasonably sized faction will have complaints
about bumping up the minimum version to 2.6. So, the question becomes
whether we should jump all the way to 2.7.

I believe we should.

Taking the long view, we will eventually need to switch to Python 3. Our
migration to Python 3 will likely involve porting all the code to
simultaneously run on both 2.x and 3.x. Python 2.7 has more backported
features from Python 3 than Python 2.6, so ensuring dual compatibility
while employing useful and convenient newer features [1] should be
easier with 2.7.
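
As a trivial illustration (nothing tree-specific, just a sketch), here are a
few of the conveniences that only arrive in 2.7 and carry over unchanged to 3.x:

  # Illustration only: 2.7 features that also exist in 3.x, which makes
  # writing code that runs on both easier than targeting 2.6.
  counts = {word: len(word) for word in ("build", "config")}    # dict comprehension (new in 2.7)
  letters = {c for word in counts for c in word}                # set comprehension (new in 2.7)
  print("sizes: {}, letters: {}".format(counts, sorted(letters)))  # auto-numbered fields (new in 2.7)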

Shorter term, 2.7 is the superior Python release. There are literally
dozens of bug fixes and minor newer features. Individually, these don't
seem like much. Cumulatively, they represent a lot of saved time and
pain.

Objections to requiring 2.7 will likely be about it not being installed
everywhere out of the box. Let's examine that.

MozillaBuild has shipped with Python 2.7 since November 2011. So,
Windows is taken care of.

OS X 10.7+ ship with Python 2.7. No action necessary. OS X 10.6 ships
with 2.6. However, 2.7 is easy to install through Homebrew, MacPorts, or
an official installer available through python.org. I believe the same
is true for 10.5. I don't consider this to be a hurdle on OS X,
especially since we already require similar steps for other required
packages there.

Linux distros are all over the map. Many include 2.7 as part of the
standard distribution. If they don't, they often include a "python27"
package. Or, at least it is a popular enough package that someone on the
internets provides an RPM, .deb, etc. We would just need to point people
at those in the build instructions on MDN.

In the worst case, you will need to compile Python from source. This is
literally |./configure && make && make install|. Not difficult if you
ask me.

Now, for those who need them (and that number goes down with time as 2.7
becomes more prevalent than 2.6), these will be extra steps. And, every
extra step makes getting started for first-time developers a little
harder. In the grand scheme of all the steps required to build the tree
today, I don't think it's such a big deal. Besides, work is currently
being done to enable one-line system bootstrap to help people initially
configure their systems [2]. Once landed, concerns about setting up
systems to build the tree should be rendered irrelevant for supported
platforms.

Some may say "why not go all the way and require Python 3?" Well,
"require" is a strong word. In my opinion we need to "support" it first.
This is because we almost certainly want to avoid a flag day conversion
because it would be a huge headache for releng and everyone else. This
means a period where we simultaneously support 2.x and 3.x. Once we have
dual support, then we can talk about requiring 3.x.

So, 2.6 or 2.7?

[1] http://docs.python.org/release/2.7.3/whatsnew/2.7.html
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=774112


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Ehsan Akhgari

On 2012-10-19 12:13 PM, Dao wrote:
> On 18.10.2012 20:05, Ehsan Akhgari wrote:
>> If your patch falls in a range which
>> causes more than 4% Ts regression, it will be backed out by our sheriffs
>> together with the rest of the patches in that range, and you can only
>> reland after you fix the regression by testing locally or on the try
>> server.
>
> The last point seems excessive. If multiple patches are in the range and
> you think yours is likely innocent, I think you should be allowed to
> re-land. Worst case is you'll get backed out a second time, but this
> seems like it's going to have less time overhead than manually messing
> with talos.


Sure, we still allow good judgement, so if you believe your patch was 
innocent, then obviously there is no regression for you to fix, so you 
can just reland.


And if Talos proves that your patch is indeed at fault, you will get 
backed out again!  ;-)


Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Downtime notice: Sunday 21oct2012, 8am-6pm PDT

2012-10-19 Thread John O'Duinn

hi;

We need to close the trees for 10 hours on Sunday 21oct2012, from 8am-6pm 
PDT, in order to do network upgrades in the MTV and SCL1 datacenters. This 
downtime is to allow for 8 hours of Ops work, followed by 2 hours of 
RelEng spinning systems back up. The SCL1 core upgrade is needed in 
preparation for the mobile test build-out (bug 799922), but that project 
gained momentum faster than we had planned, so the core needs to come 
online sooner than expected. The MTV upgrades are to help fix some recent 
tree-closing network stability issues.



Summary of work planned:
* Bug 800113 - Relocate df302 switch stack
* Bug 795955 - please investigate UPS outage longevity for mtv1
* Bug 798126 - Move ports in 3MDF from Mozilla Guest network to Mozilla 
network

* Bug 797587 - Upgrade SCL1 core network

Sorry for the short notice on this, but this work has been on hold for a 
couple of weeks while we've been dealing with chemspills, and this gap in 
the schedule just opened up.



If there are any questions / concerns about this work, please contact me 
or email rele...@mozilla.com.



thanks
John.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Justin Lebar
>> If your patch falls in a range which
>> causes more than 4% Ts regression, it will be backed out by our sheriffs
>> together with the rest of the patches in that range, and you can only
>> reland after you fix the regression by testing locally or on the try
>> server.

Our tools for comparing talos results are in a pitiful state.  I know
that improving them is part of the SfN effort, but I want to emphasize
that if we're backing out patches for Ts regressions now, we need
these tools yesterday.
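
Even a dumb script along these lines would beat eyeballing graphs (a sketch
only -- the numbers below are invented, and a real tool would pull them from
graphserver):

  # Sketch of the comparison I want: given Ts numbers for a baseline push and
  # the push under suspicion, report the % change and whether it crosses the
  # 4% backout threshold. The run data below is made up.
  def percent_change(baseline, candidate):
      base = sum(baseline) / float(len(baseline))
      cand = sum(candidate) / float(len(candidate))
      return (cand - base) / base * 100.0

  baseline_ts  = [523, 531, 528, 525]   # ms, runs before the push (invented)
  candidate_ts = [551, 548, 556, 550]   # ms, runs after the push (invented)

  change = percent_change(baseline_ts, candidate_ts)
  flag = " -- over the 4% backout threshold" if change > 4.0 else ""
  print("Ts changed by %.1f%%%s" % (change, flag))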

-Justin

On Fri, Oct 19, 2012 at 12:13 PM, Dao  wrote:
> On 18.10.2012 20:05, Ehsan Akhgari wrote:
>>
>> If your patch falls in a range which
>> causes more than 4% Ts regression, it will be backed out by our sheriffs
>> together with the rest of the patches in that range, and you can only
>> reland after you fix the regression by testing locally or on the try
>> server.
>
>
> The last point seems excessive. If multiple patches are in the range and you
> think yours is likely innocent, I think you should be allowed to re-land.
> Worst case is you'll get backed out a second time, but this seems like it's
> going to have less time overhead than manually messing with talos.
>
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Dao

On 18.10.2012 20:05, Ehsan Akhgari wrote:
> If your patch falls in a range which
> causes more than 4% Ts regression, it will be backed out by our sheriffs
> together with the rest of the patches in that range, and you can only
> reland after you fix the regression by testing locally or on the try server.


The last point seems excessive. If multiple patches are in the range and 
you think yours is likely innocent, I think you should be allowed to 
re-land. Worst case is you'll get backed out a second time, but this 
seems like it's going to have less time overhead than manually messing 
with talos.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Chris AtLee
On 18/10/12 06:44 PM, Justin Lebar wrote:
>> Do we still have the bug where a test that finishes first, but is from a
>> later cset (say a later cset IMPROVES Ts by 4% or more) would make us
>> think we regressed it on an earlier cset if that earlier talos run
>> finishes later?
>>
>> Such that we set graph points by the time the test finished, not time
>> the push was, etc.
> 
> https://bugzilla.mozilla.org/show_bug.cgi?id=688534

That applies to the rendering of the graphs on graphs.m.o only. The
regression detection uses the push time to order the results.
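
In other words, before looking for a step the analysis effectively orders the
data like this (a sketch with made-up field names, not the actual graphserver
code):

  # Sketch: order Talos results by when the change was pushed, not by when the
  # test job happened to finish, so a slow-finishing job from an earlier push
  # can't show up as a regression on a later one. Field names are made up.
  results = [
      {"revision": "abc123", "push_time": 1350650000, "finish_time": 1350660000, "ts_ms": 530},
      {"revision": "def456", "push_time": 1350651000, "finish_time": 1350655000, "ts_ms": 505},
  ]
  for r in sorted(results, key=lambda r: r["push_time"]):
      print("%s  Ts=%dms" % (r["revision"], r["ts_ms"]))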
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Ehsan Akhgari

On 2012-10-19 11:39 AM, Armen Zambrano G. wrote:
> Is there a place where this will be documented?
> I would like to keep an eye on what gets added or point people at it.


Not that I know of!  Please feel free to start a wiki page or something. 
 ;-)


Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Armen Zambrano G.

Is there a place where this will be documented?
I would like to keep an eye on what gets added or point people at it.

This is awesome!
Looking forward to seeing how it goes.

thanks Ehsan

On 2012-10-18 2:05 PM, Ehsan Akhgari wrote:

Hi everyone,

As part of our efforts to get more value out of the Talos test suite for
preventing performance regressions, we believe that we are now ready to put
a first set of measures against startup time regressions.  We will start by
imposing a new backout policy for mozilla-inbound checkins for regressions
more than 4% on any given platform.  If your patch falls in a range which
causes more than 4% Ts regression, it will be backed out by our sheriffs
together with the rest of the patches in that range, and you can only
reland after you fix the regression by testing locally or on the try server.

The 4% threshold has been chosen based on anecdotal evidence from the most
recent Ts regressions that we have seen, and is admittedly too generous, but
we will be working to improve the reporting and regression detection systems,
and as those improve, we will feel more comfortable imposing this policy on
other Talos tests with tighter thresholds.

Please let me know if you have any questions.

Cheers,
--
Ehsan




___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform