On 12-08-15 6:17 PM, William Lachance wrote:
On 08/14/2012 03:47 PM, Gregory Szorc wrote:
On 8/14/12 12:14 PM, Ed Morley wrote:
On Thursday, 9 August 2012 15:35:28 UTC+1, Justin Lebar  wrote:
Is there a plan to mitigate the coalescing on m-i?  It seems like that
is a big part of the problem.

Reducing the amount of coalescing permitted would just mean we end up
with a backlog of pending tests on the repo tip - which would result
in tree closures regardless. So other than bug 690672 making sheriffs'
lives easier, we just need more machines in the test pool - since it's
simply a case of demand exceeding capacity.
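
To make the capacity point concrete, here is a toy back-of-envelope
model (every number below is invented, not a real infrastructure
figure): once jobs arrive faster than the pool can retire them, the
pending count grows hour over hour no matter how the scheduler
coalesces.

  # Toy model of the test pool; all numbers are made up for illustration.
  pushes_per_hour = 12            # assumed push rate to the tree
  jobs_per_push = 200             # assumed build/test jobs triggered per push
  machines = 1500                 # assumed size of the test pool
  avg_job_minutes = 40            # assumed average job duration

  arriving_per_hour = pushes_per_hour * jobs_per_push         # 2400 jobs/hour
  finishing_per_hour = machines * (60.0 / avg_job_minutes)    # 2250 jobs/hour

  backlog = 0
  for hour in range(1, 9):
      backlog = max(backlog + arriving_per_hour - finishing_per_hour, 0)
      print("hour %d: ~%d jobs pending" % (hour, backlog))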

The situation is made worse now that we're adding new platforms (OS X
10.7, B2G GB, B2G ICS, Android ARMv6, soon OS X 10.8, Win8 desktop,
Win8 Metro) faster than we're EOLing old ones - and we're pushing more
changes per day than ever before [1]. From what I understand, Apple's
aggressive hardware cycle is also making it difficult to expand the
test pool [2].

Is there a tracking bug for areas where we could gain efficiency? We all
know the build phase is full of clownshoes. But I believe we also do
silly things, like executing some tests serially and only using 1/N of
the available CPU cores in the process. This is just wasting resources.
See [1] for a concrete example.
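
On the serial-test point, the general fix is just to fan independent
test chunks out across all the cores on the machine. A minimal sketch
of the idea (run_chunk and the chunk names are placeholders, not the
actual harness code):

  # Sketch: run independent test chunks in parallel instead of one after
  # another. run_chunk and the chunk list are stand-ins, not real harness code.
  import multiprocessing
  import time

  def run_chunk(chunk):
      # In reality this would invoke the harness on one chunk of tests;
      # here we just simulate some work and return a fake exit status.
      time.sleep(0.1)
      return chunk, 0

  chunks = ["mochitest-%d" % i for i in range(1, 9)]

  if __name__ == "__main__":
      pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
      results = pool.map(run_chunk, chunks)   # keeps all cores busy, not 1/N
      pool.close()
      pool.join()
      for name, status in results:
          print("%s exited with status %d" % (name, status))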

Last year we had a buildfaster project to try to improve our
end-to-end build/test times:

https://wiki.mozilla.org/ReleaseEngineering/BuildFaster

I think it has recently been reactivated, I believe mostly with the
intention of working on build times (which is important, but only a
small part of the overall picture):

http://coop.deadsquid.com/2012/07/reviving-buildfaster-fixing-makefiles/

In general I would be very careful before tackling any particular bug
for the sake of improving our build/test times. If something is slow,
but not on the critical path as far as build/test is concerned, fixing
it will not result in any tangible improvement.
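
To illustrate why the critical path is what matters, here is a toy
pipeline (the task names and durations are invented): shaving time off
a job that isn't on the longest dependency chain doesn't move the
end-to-end number at all.

  # Toy pipeline with invented durations (minutes) showing why only work
  # on the critical path shortens the end-to-end time.
  import functools

  durations = {"build": 90, "package": 10, "unit-tests": 60, "talos": 45}
  # Tests depend on the packaged build; the two test jobs run in parallel.
  deps = {"package": ["build"], "unit-tests": ["package"], "talos": ["package"]}

  @functools.lru_cache(maxsize=None)
  def finish_time(task):
      start = max((finish_time(d) for d in deps.get(task, [])), default=0)
      return start + durations[task]

  print("end to end: %d minutes" % max(finish_time(t) for t in durations))
  # Making talos 20 minutes faster changes nothing (it isn't on the critical
  # path); making the build 20 minutes faster moves the finish line by 20.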

When I was working on this project last year, I designed a build charts
view to help visualize which parts were taking the longest (you can see
implicit dependencies between build/test tasks by seeing when certain
jobs run), which proved very helpful in determining which areas we
needed to optimize:

http://brasstacks.mozilla.com/gofaster/#/buildcharts
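
For anyone who wants to do the same kind of data-staring without the
dashboard, a rough sketch of that style of chart built from job
start/end times (matplotlib; the records are made up, and this is not
how the gofaster page itself is implemented):

  # Minimal build-chart sketch: one horizontal bar per job, placed by its
  # start/end time relative to the push. The records below are invented.
  import matplotlib.pyplot as plt

  jobs = [
      ("linux build",     0,  60),
      ("win32 build",     0,  95),
      ("linux mochitest", 62, 110),
      ("win32 reftest",   97, 150),
  ]

  fig, ax = plt.subplots()
  for i, (name, start, end) in enumerate(jobs):
      ax.barh(i, end - start, left=start)
  ax.set_yticks(range(len(jobs)))
  ax.set_yticklabels([name for name, _, _ in jobs])
  ax.set_xlabel("minutes since push")
  plt.tight_layout()
  plt.show()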

I'm not sure if the data feeding into that is still valid (some things
look suspiciously low, and at the very least it doesn't seem completely
up to date). Anyway, if I were going to look into this again (I don't
have time right now, unfortunately), I would first spend a lot of
time staring at data. :)

This looks great, William. But looking at how our load has been for the past few weeks, I don't think we're going to benefit much from incremental improvements to end-to-end times.

Honestly, the only big thing we can probably fix to improve our end-to-end times is to enable pymake on our Windows builders so they can do parallel builds. Developers on Windows have been using pymake to get parallel builds for quite a while now, and somebody needs to figure out what's happening on our build machines that prevents us from using pymake there, and fix it. That should significantly decrease our Windows build times, depending on the number of cores available on our Windows builders.
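
As a rough sense of how much the core count could matter, an Amdahl-style back-of-envelope (the two-hour serial build time and the 15% non-parallelizable fraction are assumptions for illustration, not measurements of our Windows builders):

  # Back-of-envelope estimate of parallel build time on an N-core builder.
  # Both input numbers are assumptions for illustration, not measurements.
  serial_build_minutes = 120.0
  serial_fraction = 0.15   # linking and other steps that don't parallelize

  for cores in (1, 2, 4, 8):
      speedup = 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)
      print("-j%d: ~%.0f minutes" % (cores, serial_build_minutes / speedup))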

Any other low-hanging fruit I can think of amounts to small incremental improvements which, although very nice, stand no chance against the rate at which our load is increasing. So unfortunately I don't see any way to address the problem we're facing in the short term other than adding hardware.

Cheers,
Ehsan
