Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-10-01 Thread Chris AtLee

On 17:26, Tue, 23 Sep, Kyle Huey wrote:

On Tue, Aug 26, 2014 at 8:23 AM, Chris AtLee  wrote:

Just a short note to say that this experiment is now live on
mozilla-inbound.

___
dev-tree-management mailing list
dev-tree-managem...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tree-management



What was the outcome?


Thanks for the reminder.

The outcome of this experiment was inconclusive.

On the one hand, we know we didn't make anything worse. The skipping 
behaved as expected, wasn't a burden on the sheriffs, and didn't make 
wait times any worse.


On the other hand, it appears as though we improved wait times for the 
target platforms, but the signal there isn't clear due to other 
variables changing (e.g. overall load wasn't directly comparable between 
the two time windows).


We've left the skipping behaviour enabled for the moment, and are 
considering some tweaks to the amount of skipping that happens, and 
which branches/platforms it's enabled for.


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-09-23 Thread Kyle Huey
On Tue, Aug 26, 2014 at 8:23 AM, Chris AtLee  wrote:
> Just a short note to say that this experiment is now live on
> mozilla-inbound.
>

What was the outcome?

- Kyle


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-26 Thread Chris AtLee
Just a short note to say that this experiment is now live on 
mozilla-inbound.




Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonas Sicking
On Thu, Aug 21, 2014 at 3:21 PM, Jonathan Griffin  wrote:
> In summary, the sheriffs won't be backing out extra commits because of the
> coalescing, and it remains the sheriffs' job to backfill tests when they
> determine they need to do so in order to bisect a failure.   We aren't
> placing any extra burden on developers with this experiment, and part of the
> reason for this experiment is to determine how much of an extra burden this
> is for the sheriffs.

As long as sheriffs are in support of this (which it sounds like is
the case), then this sounds awesome to me.

/ Jonas


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Mike Hommey
On Thu, Aug 21, 2014 at 03:03:30PM -0700, Jonas Sicking wrote:
> What will be the policy if a test fails and it's unclear which push
> caused the regression?

You may have missed the main point that it's not "What will", but "What
is". It *is* already the case that tests are skipped.

Mike


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin
It will be handled just like coalesced jobs today:  sheriffs will 
backfill the missing data, and backout only the offender.


An illustration might help.  Today we might have something like this, 
for a given job:


         linux64-debug  win7-debug  osx8-debug
commit 1 pass           pass        pass
commit 2 pass           pass        pass
commit 3 pass           fail        pass
commit 4 pass           fail        pass

In this case (assuming the two failures are the same), it's easy for 
sheriffs to see that commit 3 is the culprit and the one that needs to 
be backed out.


During the experiment, we might see something like this:

         linux64-debug  win7-debug  osx8-debug
commit 1 pass           pass        pass
commit 2 pass           not run     not run
commit 3 pass           fail        pass
commit 4 pass           not run     not run

Here, it isn't obvious whether the problem is caused by commit 2 or 
commit 3.  (This situation already occurs today because of "random" 
coalescing.)


In this case, the sheriffs will backfill missing test data, so we might see:

         linux64-debug  win7-debug  osx8-debug
commit 1 pass           pass        pass
commit 2 pass           pass        not run
commit 3 pass           fail        pass
commit 4 pass           fail        not run

...and then they have enough data to determine that commit 3 (and not 
commit 2) is to blame, and can take the appropriate action.


In summary, the sheriffs won't be backing out extra commits because of 
the coalescing, and it remains the sheriffs' job to backfill tests when 
they determine they need to do so in order to bisect a failure.   We 
aren't placing any extra burden on developers with this experiment, and 
part of the reason for this experiment is to determine how much of an 
extra burden this is for the sheriffs.


Jonathan

On 8/21/2014 3:03 PM, Jonas Sicking wrote:

What will be the policy if a test fails and it's unclear which push
caused the regression? Is it the sheriffs' job, or the job of the people
who pushed, to figure out which push was the culprit and make sure that
push gets backed out?

I.e. if 4 pushes land between two testruns, and we see a regression,
will the 4 pushes be backed out? Or will sheriffs run the missing
tests and only back out the offending push?

/ Jonas

On Thu, Aug 21, 2014 at 10:50 AM, Jonathan Griffin  wrote:

Thanks Ed.  To paraphrase, no test coverage is being lost here, we're just
being a little more deliberate with job coalescing.  All tests will be run
on all platforms (including debug tests) on a commit before a merge to m-c.

Jonathan


On 8/21/2014 9:35 AM, Ed Morley wrote:

I think much of the pushback in this thread is due to a misunderstanding
of some combination of:
* our current buildbot scheduling
* the proposal
* how trees are sheriffed and merged

To clarify:

1) We already have coalescing [*] of jobs on all trees apart from try.

2) This coalescing means that all jobs are still run at some point, but
just may not run on every push.

3) When failures are detected, coalescing means that regression ranges are
larger and so sometimes result in longer tree integration repo closures,
whilst the sheriffs force trigger jobs on the revisions that did not
originally run them.

4) When merging into mozilla-central, sheriffs ensure that all jobs are
green - including those that got coalesced and those that are only scheduled
periodically (eg non-unified & PGO builds are only run every 3 hours). (This
is a fairly manual process currently, but better tooling should be possible
with treeherder).

5) This proposal does not mean debug-only issues are somehow not worth
acting on or that they'll end up shipped/on mozilla-central, thanks to #4.

6) This proposal is purely trying to make existing coalescing (#1/#2) more
intelligent, to ensure that we expend the finite amount of machine time we
have at present on the most appropriate jobs at each point, in order to
reduce the impact of #3.

Fwiw I'm on the fence as to whether the algorithm suggested in this
proposal is the most effective way to aid with #3 - however it's worth
trying to find out.

Best wishes,

Ed

[*] Collapsing of pending jobs of the same type, when the queue size is
greater than 1.






Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonas Sicking
What will be the policy if a test fails and it's unclear which push
caused the regression? Is it the sheriffs' job, or the job of the people
who pushed, to figure out which push was the culprit and make sure that
push gets backed out?

I.e. if 4 pushes land between two testruns, and we see a regression,
will the 4 pushes be backed out? Or will sheriffs run the missing
tests and only back out the offending push?

/ Jonas

On Thu, Aug 21, 2014 at 10:50 AM, Jonathan Griffin  wrote:
> Thanks Ed.  To paraphrase, no test coverage is being lost here, we're just
> being a little more deliberate with job coalescing.  All tests will be run
> on all platforms (including debug tests) on a commit before a merge to m-c.
>
> Jonathan
>
>
> On 8/21/2014 9:35 AM, Ed Morley wrote:
>>
>> I think much of the pushback in this thread is due to a misunderstanding
>> of some combination of:
>> * our current buildbot scheduling
>> * the proposal
>> * how trees are sheriffed and merged
>>
>> To clarify:
>>
>> 1) We already have coalescing [*] of jobs on all trees apart from try.
>>
>> 2) This coalescing means that all jobs are still run at some point, but
>> just may not run on every push.
>>
>> 3) When failures are detected, coalescing means that regression ranges are
>> larger and so sometimes result in longer tree integration repo closures,
>> whilst the sheriffs force trigger jobs on the revisions that did not
>> originally run them.
>>
>> 4) When merging into mozilla-central, sheriffs ensure that all jobs are
>> green - including those that got coalesced and those that are only scheduled
>> periodically (eg non-unified & PGO builds are only run every 3 hours). (This
>> is a fairly manual process currently, but better tooling should be possible
>> with treeherder).
>>
>> 5) This proposal does not mean debug-only issues are somehow not worth
>> acting on or that they'll end up shipped/on mozilla-central, thanks to #4.
>>
>> 6) This proposal is purely trying to make existing coalescing (#1/#2) more
>> intelligent, to ensure that we expend the finite amount of machine time we
>> have at present on the most appropriate jobs at each point, in order to
>> reduce the impact of #3.
>>
>> Fwiw I'm on the fence as to whether the algorithm suggested in this
>> proposal is the most effective way to aid with #3 - however it's worth
>> trying to find out.
>>
>> Best wishes,
>>
>> Ed
>>
>> [*] Collapsing of pending jobs of the same type, when the queue size is
>> greater than 1.
>
>


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin

Hey Martin,

This is a good idea, and we've been thinking about approaches like 
this.  Basically, the idea is to run tests that "(nearly) always pass" 
less often.  There are currently some tests that fit into this category, 
like dom level0,1,2 tests in mochitest-plain, and those are 
time-consuming to run.  Your idea takes this a step further, by 
identifying tests that sometimes fail, correlating those with code 
changes, and ensuring those get run.


Both of these require some tooling to implement, so we're experimenting 
initially with approaches that we can get nearly for "free", like 
running some tests only every other commit, and letting sheriffs trigger 
the missing tests in case a failure occurs.


The ultimate solution may blend a bit of both approaches, and will have 
to balance implementation cost with the gain we get from the related 
reduction in slave load.


Jonathan


On 8/21/2014 10:07 AM, Martin Thomson wrote:

On 20/08/14 17:37, Jonas Sicking wrote:

It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only.


Here's a potential project that might help.  For all of the trees 
(probably try especially), look at the checkins and for each directory 
affected build up a probability of failure for each of the tests.


You would have to find which commits were on m-c at the time of the 
run to set the baseline for the checkin; and intermittent failures 
would add a certain noise floor.


The basic idea though is that the information would be very simple to 
use: For each directory touched in a commit, find all the tests that 
cross a certain failure threshold across the assembled dataset and 
ensure that those test groups are run.


And this would need to include prerequisites, like builds for the 
given runs.  You would, of course, include builds as tests.


Setting the threshold might take some tuning, because failure rates 
will vary across different test groups.  I keep hearing bad things 
about certain ones, for instance; and build failures are naturally far 
less common than test failures on the whole.





Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Jonathan Griffin
Thanks Ed.  To paraphrase, no test coverage is being lost here, we're 
just being a little more deliberate with job coalescing.  All tests will 
be run on all platforms (including debug tests) on a commit before a 
merge to m-c.


Jonathan

On 8/21/2014 9:35 AM, Ed Morley wrote:
I think much of the pushback in this thread is due to a 
misunderstanding of some combination of:

* our current buildbot scheduling
* the proposal
* how trees are sheriffed and merged

To clarify:

1) We already have coalescing [*] of jobs on all trees apart from try.

2) This coalescing means that all jobs are still run at some point, 
but just may not run on every push.


3) When failures are detected, coalescing means that regression ranges 
are larger and so sometimes result in longer tree integration repo 
closures, whilst the sheriffs force trigger jobs on the revisions that 
did not originally run them.


4) When merging into mozilla-central, sheriffs ensure that all jobs 
are green - including those that got coalesced and those that are only 
scheduled periodically (eg non-unified & PGO builds are only run every 
3 hours). (This is a fairly manual process currently, but better 
tooling should be possible with treeherder).


5) This proposal does not mean debug-only issues are somehow not worth 
acting on or that they'll end up shipped/on mozilla-central, thanks to 
#4.


6) This proposal is purely trying to make existing coalescing (#1/#2) 
more intelligent, to ensure that we expend the finite amount of 
machine time we have at present on the most appropriate jobs at each 
point, in order to reduce the impact of #3.


Fwiw I'm on the fence as to whether the algorithm suggested in this 
proposal is the most effective way to aid with #3 - however it's worth 
trying to find out.


Best wishes,

Ed

[*] Collapsing of pending jobs of the same type, when the queue size 
is greater than 1.




Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Chris Peterson

On 8/21/14 9:35 AM, Ed Morley wrote:

4) When merging into mozilla-central, sheriffs ensure that all jobs are
green - including those that got coalesced and those that are only
scheduled periodically (eg non-unified & PGO builds are only run every 3
hours). (This is a fairly manual process currently, but better tooling
should be possible with treeherder).


To ensure that all code landing in mozilla-central has passed debug 
tests, sheriffs could merge only from the mozilla-inbound changesets 
that ran the debug tests.



chris


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Martin Thomson

On 20/08/14 17:37, Jonas Sicking wrote:

It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only.


Here's a potential project that might help.  For all of the trees 
(probably try especially), look at the checkins and for each directory 
affected build up a probability of failure for each of the tests.


You would have to find which commits were on m-c at the time of the run 
to set the baseline for the checkin; and intermittent failures would add 
a certain noise floor.


The basic idea though is that the information would be very simple to 
use: For each directory touched in a commit, find all the tests that 
cross a certain failure threshold across the assembled dataset and 
ensure that those test groups are run.


And this would need to include prerequisites, like builds for the given 
runs.  You would, of course, include builds as tests.


Setting the threshold might take some tuning, because failure rates will 
vary across different test groups.  I keep hearing bad things about 
certain ones, for instance; and build failures are naturally far less 
common than test failures on the whole.



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Ed Morley
I think much of the pushback in this thread is due to a misunderstanding 
of some combination of:

* our current buildbot scheduling
* the proposal
* how trees are sheriffed and merged

To clarify:

1) We already have coalescing [*] of jobs on all trees apart from try.

2) This coalescing means that all jobs are still run at some point, but 
just may not run on every push.


3) When failures are detected, coalescing means that regression ranges 
are larger and so sometimes result in longer tree integration repo 
closures, whilst the sheriffs force trigger jobs on the revisions that 
did not originally run them.


4) When merging into mozilla-central, sheriffs ensure that all jobs are 
green - including those that got coalesced and those that are only 
scheduled periodically (eg non-unified & PGO builds are only run every 3 
hours). (This is a fairly manual process currently, but better tooling 
should be possible with treeherder).


5) This proposal does not mean debug-only issues are somehow not worth 
acting on or that they'll end up shipped/on mozilla-central, thanks to #4.


6) This proposal is purely trying to make existing coalescing (#1/#2) 
more intelligent, to ensure that we expend the finite amount of machine 
time we have at present on the most appropriate jobs at each point, in 
order to reduce the impact of #3.


Fwiw I'm on the fence as to whether the algorithm suggested in this 
proposal is the most effective way to aid with #3 - however it's worth 
trying to find out.


Best wishes,

Ed

[*] Collapsing of pending jobs of the same type, when the queue size is 
greater than 1.
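The collapsing described in [*] can be modelled in a few lines (a toy model, not the actual buildbot implementation; the job dictionaries and field names are invented):

```python
def coalesce(pending):
    """Collapse queued jobs of the same type down to the newest revision.
    Jobs arrive oldest-first; anything older than the newest job of its
    type is skipped and returned, so sheriffs can backfill it later."""
    newest, skipped = {}, []
    for job in pending:
        if job["type"] in newest:
            skipped.append(newest[job["type"]])
        newest[job["type"]] = job
    return list(newest.values()), skipped
```

With two linux64-debug jobs queued, only the newer revision's job runs and the older one lands in the skipped list, which is exactly the gap sheriffs force-trigger when a failure needs bisecting.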



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Milan Sreckovic


On Aug 21, 2014, at 10:12 , Chris AtLee  wrote:

> On 17:37, Wed, 20 Aug, Jonas Sicking wrote:
>> On Wed, Aug 20, 2014 at 4:24 PM, Jeff Gilbert  wrote:
>>> I have been asked in the past if we really need to run WebGL tests on 
>>> Android, if they have coverage on Desktop platforms.
>>> And then again later, why B2G if we have Android.
>>> 
>>> There seems to be enough belief in test-once-run-everywhere that I feel the 
>>> need to *firmly* establish that this is not acceptable, at least for the 
>>> code I work with.
>>> I'm happy I'm not alone in this.
>> 
>> I'm a firm believer that we ultimately need to run basically all
>> combinations of tests and platforms before allowing code to reach
>> mozilla-central. There's lots of platform specific code paths, and
>> it's hard to track which tests trigger them, and which don't.
> 
> I think we can agree on this. However, not running all tests on all platforms 
> per push on mozilla-inbound (or other branch) doesn't mean that they won't be 
> run on mozilla-central, or even on mozilla-inbound prior to merging.
> 
> I'm a firm believer that running all tests for all platforms for all pushes 
> is a waste of our infrastructure and human resources.
> 
> I think the gap we need to figure out how to fill is between getting per-push 
> efficiency and full test coverage prior to merging.

The cost of not catching a problem with a test and letting the code land is 
huge.  I only know this for the graphics team, but to Ehsan’s and Jonas’ 
point, I’m sure it’s not specific to graphics.  One is a preventative cost 
(tests), the other a treatment cost (fixing issues that snuck through), so 
the two are sometimes difficult to compare.  We are not alone in cutting the 
preventative cost first, but it’s a big mistake to do so.

Now, if we need to save some electricity or cash, I understand that as well; 
it eventually translates to a cost to the company, the same as people’s 
time.  If we can do something by skipping every n-th debug run, sure, let’s 
try it.  We just have to make sure that a failure on a debug test run 
triggers us going back and re-running the skipped ones, so that we don’t 
have any gaps in the tests where something may have gone wrong.


> 
>> It would however be really cool if we were able to pull data on which
>> tests tend to fail in a way that affects all platforms, and which ones
>> tend to fail on one platform only. If we combine this with the ability
>> of having tbpl (or treeherder) "fill in the blanks" whenever a test
>> fails, it seems like we could run many of our tests on only one
>> platform for most checkins to mozilla-inbound.
> 
> There are dozens of really interesting approaches we could take here.
> Skipping every nth debug test run is one of the simplest, and I hope we can 
> learn a lot from the experiment.
> 
> Cheers,
> Chris



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Chris AtLee

On 17:37, Wed, 20 Aug, Jonas Sicking wrote:

On Wed, Aug 20, 2014 at 4:24 PM, Jeff Gilbert  wrote:

I have been asked in the past if we really need to run WebGL tests on Android, 
if they have coverage on Desktop platforms.
And then again later, why B2G if we have Android.

There seems to be enough belief in test-once-run-everywhere that I feel the 
need to *firmly* establish that this is not acceptable, at least for the code I 
work with.
I'm happy I'm not alone in this.


I'm a firm believer that we ultimately need to run basically all
combinations of tests and platforms before allowing code to reach
mozilla-central. There's lots of platform specific code paths, and
it's hard to track which tests trigger them, and which don't.


I think we can agree on this. However, not running all tests on all 
platforms per push on mozilla-inbound (or other branch) doesn't mean 
that they won't be run on mozilla-central, or even on mozilla-inbound 
prior to merging.


I'm a firm believer that running all tests for all platforms for all 
pushes is a waste of our infrastructure and human resources.


I think the gap we need to figure out how to fill is between getting 
per-push efficiency and full test coverage prior to merging.



It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only. If we combine this with the ability
of having tbpl (or treeherder) "fill in the blanks" whenever a test
fails, it seems like we could run many of our tests on only one
platform for most checkins to mozilla-inbound.


There are dozens of really interesting approaches we could take here.
Skipping every nth debug test run is one of the simplest, and I hope we 
can learn a lot from the experiment.


Cheers,
Chris




Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Jonas Sicking
On Wed, Aug 20, 2014 at 4:24 PM, Jeff Gilbert  wrote:
> I have been asked in the past if we really need to run WebGL tests on 
> Android, if they have coverage on Desktop platforms.
> And then again later, why B2G if we have Android.
>
> There seems to be enough belief in test-once-run-everywhere that I feel the 
> need to *firmly* establish that this is not acceptable, at least for the code 
> I work with.
> I'm happy I'm not alone in this.

I'm a firm believer that we ultimately need to run basically all
combinations of tests and platforms before allowing code to reach
mozilla-central. There's lots of platform specific code paths, and
it's hard to track which tests trigger them, and which don't.

It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only. If we combine this with the ability
of having tbpl (or treeherder) "fill in the blanks" whenever a test
fails, it seems like we could run many of our tests on only one
platform for most checkins to mozilla-inbound.

/ Jonas


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Jeff Gilbert
> From: "Ehsan Akhgari" 
> To: "Jeff Gilbert" 
> Cc: "Chris AtLee" , "Jonathan Griffin" 
> , dev-platform@lists.mozilla.org
> Sent: Wednesday, August 20, 2014 4:00:15 PM
> Subject: Re: Experiment with running debug tests less often on 
> mozilla-inbound the week of August 25
> 
> On 2014-08-20, 6:29 PM, Jeff Gilbert wrote:
> > If running debug tests on a single platform is generally sufficient for
> > non-graphics bugs,
> 
> It is not.  That is the point I was trying to make.  :-)
> 
>  > it might be useful to have the Graphics branch run debug tests on all
> platforms, for use with graphics checkins. (while running a decreased
> number of debug tests on the main branches) It's still possible for
> non-graphics code to expose platform-specific bugs, but it's less
> likely, so maybe larger regression windows are acceptable for
> platform-specific bugs in non-graphics code.
> 
> I don't really understand how graphics is special here.  We do have
> platform specific code outside of graphics as well, so we don't need to
> solve this problem for gfx specifically.
> 

Maybe Graphics isn't that special, but this stuff hits really close to home for 
us.

I have been asked in the past if we really need to run WebGL tests on Android, 
if they have coverage on Desktop platforms.
And then again later, why B2G if we have Android.

There seems to be enough belief in test-once-run-everywhere that I feel the 
need to *firmly* establish that this is not acceptable, at least for the code I 
work with.
I'm happy I'm not alone in this.

-Jeff


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ehsan Akhgari

On 2014-08-20, 6:29 PM, Jeff Gilbert wrote:

If running debug tests on a single platform is generally sufficient for 
non-graphics bugs,


It is not.  That is the point I was trying to make.  :-)

> it might be useful to have the Graphics branch run debug tests on all 
platforms, for use with graphics checkins. (while running a decreased 
number of debug tests on the main branches) It's still possible for 
non-graphics code to expose platform-specific bugs, but it's less 
likely, so maybe larger regression windows are acceptable for 
platform-specific bugs in non-graphics code.


I don't really understand how graphics is special here.  We do have 
platform specific code outside of graphics as well, so we don't need to 
solve this problem for gfx specifically.




Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Jeff Gilbert
If running debug tests on a single platform is generally sufficient for 
non-graphics bugs, it might be useful to have the Graphics branch run debug 
tests on all platforms, for use with graphics checkins. (while running a 
decreased number of debug tests on the main branches) It's still possible for 
non-graphics code to expose platform-specific bugs, but it's less likely, so 
maybe larger regression windows are acceptable for platform-specific bugs in 
non-graphics code.

-Jeff

- Original Message -
From: "Ehsan Akhgari" 
To: "Jeff Gilbert" , "Chris AtLee" 
Cc: "Jonathan Griffin" , dev-platform@lists.mozilla.org
Sent: Wednesday, August 20, 2014 3:16:31 PM
Subject: Re: Experiment with running debug tests less often on mozilla-inbound 
the week of August 25

On 2014-08-20, 5:46 PM, Jeff Gilbert wrote:
> Graphics in particular is plagued by non-cross-platform code. Debug coverage 
> on Linux gives us no practical coverage for our windows, mac, android, or b2g 
> code. Maybe this is better solved with reviving the Graphics branch, however.

Having more branches doesn't necessarily help with consuming fewer infra 
resources, unless the builds are run at a lower frequency or something.

Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ehsan Akhgari

On 2014-08-20, 5:46 PM, Jeff Gilbert wrote:

Graphics in particular is plagued by non-cross-platform code. Debug coverage on 
Linux gives us no practical coverage for our windows, mac, android, or b2g 
code. Maybe this is better solved with reviving the Graphics branch, however.


Having more branches doesn't necessarily help with consuming fewer infra 
resources, unless the builds are run at a lower frequency or something.


Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Jeff Gilbert
Graphics in particular is plagued by non-cross-platform code. Debug coverage on 
Linux gives us no practical coverage for our windows, mac, android, or b2g 
code. Maybe this is better solved with reviving the Graphics branch, however.

-Jeff

- Original Message -
From: "Chris AtLee" 
To: "Ehsan Akhgari" 
Cc: "Jonathan Griffin" , "Jeff Gilbert" 
, dev-platform@lists.mozilla.org
Sent: Wednesday, August 20, 2014 9:02:14 AM
Subject: Re: Experiment with running debug tests less often on mozilla-inbound 
the week of August 25

On 18:25, Tue, 19 Aug, Ehsan Akhgari wrote:
>On 2014-08-19, 5:49 PM, Jonathan Griffin wrote:
>>On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:
>>>On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:
>>>>I would actually say that debug tests are more important for
>>>>continuous integration than opt tests. At least in code I deal with,
>>>>we have a ton of asserts to guarantee behavior, and we really want
>>>>test coverage with these via CI. If a test passes on debug, it should
>>>>almost certainly pass on opt, just faster. The opposite is not true.
>>>>
>>>>"They take a long time and then break" is part of what I believe
>>>>caused us to not bother with debug testing on much of Android and
>>>>B2G, which we still haven't completely fixed. It should be
>>>>unacceptable to ship without CI on debug tests, but here we are
>>>>anyways. (This is finally nearly fixed, though there is still some
>>>>work to do)
>>>>
>>>>I'm not saying running debug tests less often is on the same scale of
>>>>bad, but I would like to express my concerns about heading in that
>>>>direction.
>>>
>>>I second this.  I'm curious to know why you picked debug tests for
>>>this experiment.  Would it not make more sense to run opt tests on
>>>desktop platforms on every other run?
>>>
>>Just based on the fact that they take longer and thus running them less
>>frequently would have a larger impact.  If there's a broad consensus
>>that debug runs are more valuable, we could switch to running opt tests
>>less frequently instead.
>
>Yep, the debug tests indeed take more time, mostly because they run 
>more checks.  :-)  The checks in opt builds are not exactly a subset 
>of the ones in debug builds, but they are close.  Based on that, I 
>think running opt tests on every other push is a more conservative 
>one, and I support it more.  That being said, for this one week 
>limited trial, given that the sheriffs will help backfill the skipped 
>tests, I don't care very strongly about this, as long as it doesn't 
>set the precedent that we can ignore debug tests!

I'd like to highlight that we're still planning on running debug 
linux64 tests for every build. This is based on the assumption that 
debug-specific failures are generally cross-platform failures as well.

Does this help alleviate some concern? Or is that assumption just plain 
wrong?

Cheers,
Chris


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Mike Hommey
On Wed, Aug 20, 2014 at 03:58:55PM +0100, Ed Morley wrote:
> On 19/08/2014 21:55, Benoit Girard wrote:
> >I completely agree with Jeff Gilbert on this one.
> >
> >I think we should try to coalesce -better-. I just checked the current
> >state of mozilla-inbound and it doesn't feel any of the current patch
> >really need their own set of tests because they're are not time
> >sensitive or sufficiently complex. Right now developers are asked to
> >create bugs for their own change with their own patch. This leads to a
> >lot of little patches being landed by individual developers which
> >seems to reflect the current state of mozilla-inbound.
> >
> >Perhaps we should instead promote checkin-needed (or a similar simple mechanism)
> >to coalesce simple changes together. Opting into this means that your
> >patch may take significantly longer to get merged if it's landed with
> >another bad patch and should only be used when that's acceptable.
> >Right now developers with commit access are not encouraged to make use
> >of checkin-needed AFAIK. If we started recommending against individual
> >landings for simple changes, and improved the process, we could
> >probably significantly cut the number of tests jobs by cutting the
> >number of pushes.
> 
> I agree we should try to coalesce better - however doing this via a manual
> "let's get someone to push a bunch of checkin-needed patches in one go" is
> suboptimal:
> 1) By tweaking coalescing in buildbot & pushing patches individually, we
> could get the same build+test job per commit ratio as doing checkin-neededs,
> but with the bonus of being able to backfill jobs where needed. This isn't
> possible when say 10-20 checkin-neededs are landed in one push, since our
> tooling can only trigger (and more importantly display the results of) jobs
> on a per push level.

It would have been useful on several occasions to be able to trigger
builds at changeset level instead of push level, independently of
checkin-needed.

Mike


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ehsan Akhgari

On 2014-08-20, 12:02 PM, Chris AtLee wrote:

On 18:25, Tue, 19 Aug, Ehsan Akhgari wrote:

On 2014-08-19, 5:49 PM, Jonathan Griffin wrote:

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:

I would actually say that debug tests are more important for
continuous integration than opt tests. At least in code I deal with,
we have a ton of asserts to guarantee behavior, and we really want
test coverage with these via CI. If a test passes on debug, it should
almost certainly pass on opt, just faster. The opposite is not true.

"They take a long time and then break" is part of what I believe
caused us to not bother with debug testing on much of Android and
B2G, which we still haven't completely fixed. It should be
unacceptable to ship without CI on debug tests, but here we are
anyways. (This is finally nearly fixed, though there is still some
work to do)

I'm not saying running debug tests less often is on the same scale of
bad, but I would like to express my concerns about heading in that
direction.


I second this.  I'm curious to know why you picked debug tests for
this experiment.  Would it not make more sense to run opt tests on
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less
frequently would have a larger impact.  If there's a broad consensus
that debug runs are more valuable, we could switch to running opt tests
less frequently instead.


Yep, the debug tests indeed take more time, mostly because they run
more checks.  :-)  The checks in opt builds are not exactly a subset
of the ones in debug builds, but they are close.  Based on that, I
think running opt tests on every other push is a more conservative
one, and I support it more.  That being said, for this one week
limited trial, given that the sheriffs will help backfill the skipped
tests, I don't care very strongly about this, as long as it doesn't
set the precedent that we can ignore debug tests!


I'd like to highlight that we're still planning on running debug linux64
tests for every build. This is based on the assumption that
debug-specific failures are generally cross-platform failures as well.

Does this help alleviate some concern? Or is that assumption just plain
wrong?


well, yes, most of our code is cross-platform, but there are debug-only 
checks in our platform-specific code as well, so if we're talking about 
something more permanent than that week-long experiment, then running 
debug tests on Linux64 doesn't alleviate all concerns.  But it's fine 
for this short experiment.


Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Chris AtLee

On 18:25, Tue, 19 Aug, Ehsan Akhgari wrote:

On 2014-08-19, 5:49 PM, Jonathan Griffin wrote:

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:

I would actually say that debug tests are more important for
continuous integration than opt tests. At least in code I deal with,
we have a ton of asserts to guarantee behavior, and we really want
test coverage with these via CI. If a test passes on debug, it should
almost certainly pass on opt, just faster. The opposite is not true.

"They take a long time and then break" is part of what I believe
caused us to not bother with debug testing on much of Android and
B2G, which we still haven't completely fixed. It should be
unacceptable to ship without CI on debug tests, but here we are
anyways. (This is finally nearly fixed, though there is still some
work to do)

I'm not saying running debug tests less often is on the same scale of
bad, but I would like to express my concerns about heading in that
direction.


I second this.  I'm curious to know why you picked debug tests for
this experiment.  Would it not make more sense to run opt tests on
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less
frequently would have a larger impact.  If there's a broad consensus
that debug runs are more valuable, we could switch to running opt tests
less frequently instead.


Yep, the debug tests indeed take more time, mostly because they run 
more checks.  :-)  The checks in opt builds are not exactly a subset 
of the ones in debug builds, but they are close.  Based on that, I 
think running opt tests on every other push is a more conservative 
one, and I support it more.  That being said, for this one week 
limited trial, given that the sheriffs will help backfill the skipped 
tests, I don't care very strongly about this, as long as it doesn't 
set the precedent that we can ignore debug tests!


I'd like to highlight that we're still planning on running debug 
linux64 tests for every build. This is based on the assumption that 
debug-specific failures are generally cross-platform failures as well.


Does this help alleviate some concern? Or is that assumption just plain 
wrong?


Cheers,
Chris




Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ben Hearsum
On 14-08-20 11:48 AM, Ehsan Akhgari wrote:
> On 2014-08-20, 10:58 AM, Ed Morley wrote:
>> On 19/08/2014 21:55, Benoit Girard wrote:
>>> I completely agree with Jeff Gilbert on this one.
>>>
>>> I think we should try to coalesce -better-. I just checked the current
>>> state of mozilla-inbound and it doesn't feel like any of the current
>>> patches really need their own set of tests, because they are not
>>> time-sensitive or sufficiently complex. Right now developers are asked to
>>> create bugs for their own change with their own patch. This leads to a
>>> lot of little patches being landed by individual developers which
>>> seems to reflect the current state of mozilla-inbound.
>>>
>>> Perhaps we should instead promote checkin-needed (or a similar simple mechanism)
>>> to coalesce simple changes together. Opting into this means that your
>>> patch may take significantly longer to get merged if it's landed with
>>> another bad patch and should only be used when that's acceptable.
>>> Right now developers with commit access are not encouraged to make use
>>> of checkin-needed AFAIK. If we started recommending against individual
>>> landings for simple changes, and improved the process, we could
>>> probably significantly cut the number of tests jobs by cutting the
>>> number of pushes.
>>
>> I agree we should try to coalesce better - however doing this via a
>> manual "let's get someone to push a bunch of checkin-needed patches in
>> one go" is suboptimal:
>> 1) By tweaking coalescing in buildbot & pushing patches individually, we
>> could get the same build+test job per commit ratio as doing
>> checkin-neededs, but with the bonus of being able to backfill jobs where
>> needed. This isn't possible when say 10-20 checkin-neededs are landed in
>> one push, since our tooling can only trigger (and more importantly
>> display the results of) jobs on a per push level.
>> 2) Tooling can help make these decisions much more effectively and
>> quickly than someone picking through bugs - ie we should expand the
>> current "only schedule job X if directory Y changed" buildbotcustom
>> logic further.
>> 3) Adding a human in the workflow increases r+-to-committed cycle times,
>> uses up scarce sheriff time, and also means the person who wrote the
>> patch is not the one landing it, and so someone unfamiliar with the code
>> often ends up being the one to resolve conflicts. We should be using
> tooling, not human cycles to land patches in a repo (ie the
>> long-promised autoland).
> 
> Another approach to this would be to identify more patterns that allow
> us to skip some jobs.  We already do this for very simple things
> (changes to a file in browser/ won't trigger b2g and Android builds, for
> example), but I'm sure we could do more.  For example, changes to files
> under widget/ may only affect one platform, depending on which
> directories you touch.  Another idea that I have had is adding some
> smarts to parse the test manifest files and recognize things such as
> skip-if, to figure out which platforms a test-only change, for example,
> might not affect, and skip the builds and tests on those platforms.
> 
> One thing to note is that there is going to be a *long* tail of these
> types of heuristics that we could come up with, so it would be nice to
> try to only address the ones that provide the most benefits.  For that,
> someone needs to look at the recent N commits on a given branch and
> figure out what jobs we _could_ have safely skipped for each one.

If someone wants to have a look at doing this more intelligently, the
relevant code is reasonably isolated in
https://github.com/mozilla/build-buildbotcustom/blob/master/misc.py#L127. The
object received there is a Buildbot Change object, which contains most
(all) of the metadata about a revision:
https://mxr.mozilla.org/build-central/source/buildbot/master/buildbot/changes/changes.py#11

I believe this is called once for every revision in a push.
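For illustration, the kind of per-change filtering that hook could do might look
roughly like the sketch below. The function name, the directory map, and the
platform labels are all invented for this example; they are not the real
buildbotcustom API.

```python
# Hypothetical sketch of per-change job filtering, in the spirit of the
# buildbotcustom misc.py hook discussed above.  The directory map and all
# names are illustrative, not the real API.

# Directories assumed (for this sketch only) to be irrelevant per platform.
IRRELEVANT_DIRS = {
    "android": ("browser/",),
    "b2g": ("browser/",),
}

def should_skip_build(platform, changed_files):
    """Return True if every changed file lives in a directory that cannot
    affect `platform`, so its build/test jobs could be skipped."""
    prefixes = IRRELEVANT_DIRS.get(platform, ())
    if not changed_files or not prefixes:
        # No information, or nothing known to be irrelevant: run the jobs.
        return False
    return all(f.startswith(prefixes) for f in changed_files)
```

A real implementation would receive a Buildbot Change object and read its
file list, but the decision logic reduces to a predicate of this shape.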


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ehsan Akhgari

On 2014-08-20, 10:58 AM, Ed Morley wrote:

On 19/08/2014 21:55, Benoit Girard wrote:

I completely agree with Jeff Gilbert on this one.

I think we should try to coalesce -better-. I just checked the current
state of mozilla-inbound and it doesn't feel like any of the current
patches really need their own set of tests, because they are not
time-sensitive or sufficiently complex. Right now developers are asked to
create bugs for their own change with their own patch. This leads to a
lot of little patches being landed by individual developers which
seems to reflect the current state of mozilla-inbound.

Perhaps we should instead promote checkin-needed (or a similar simple mechanism)
to coalesce simple changes together. Opting into this means that your
patch may take significantly longer to get merged if it's landed with
another bad patch and should only be used when that's acceptable.
Right now developers with commit access are not encouraged to make use
of checkin-needed AFAIK. If we started recommending against individual
landings for simple changes, and improved the process, we could
probably significantly cut the number of tests jobs by cutting the
number of pushes.


I agree we should try to coalesce better - however doing this via a
manual "let's get someone to push a bunch of checkin-needed patches in
one go" is suboptimal:
1) By tweaking coalescing in buildbot & pushing patches individually, we
could get the same build+test job per commit ratio as doing
checkin-neededs, but with the bonus of being able to backfill jobs where
needed. This isn't possible when say 10-20 checkin-neededs are landed in
one push, since our tooling can only trigger (and more importantly
display the results of) jobs on a per push level.
2) Tooling can help make these decisions much more effectively and
quickly than someone picking through bugs - ie we should expand the
current "only schedule job X if directory Y changed" buildbotcustom
logic further.
3) Adding a human in the workflow increases r+-to-committed cycle times,
uses up scarce sheriff time, and also means the person who wrote the
patch is not the one landing it, and so someone unfamiliar with the code
often ends up being the one to resolve conflicts. We should be using
tooling, not human cycles to land patches in a repo (ie the
long-promised autoland).


Another approach to this would be to identify more patterns that allow 
us to skip some jobs.  We already do this for very simple things 
(changes to a file in browser/ won't trigger b2g and Android builds, for 
example), but I'm sure we could do more.  For example, changes to files 
under widget/ may only affect one platform, depending on which 
directories you touch.  Another idea that I have had is adding some 
smarts to parse the test manifest files and recognize things such as 
skip-if, to figure out which platforms a test-only change, for example, 
might not affect, and skip the builds and tests on those platforms.
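A minimal sketch of that manifest-parsing idea follows. The helper name and
the manifest snippet are invented, and the parsing is heavily simplified
(real manifests support far more condition syntax than `os == '...'`), but it
shows how skip-if annotations could feed a job-skipping decision.

```python
import re

def skipped_oses(manifest_text, test_name):
    """Collect the OS names listed in a test's skip-if annotation.

    Simplified sketch: only handles `os == '...'` clauses and assumes one
    manifest section per test.
    """
    oses = set()
    in_section = False
    for line in manifest_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # Section headers like [test_foo.html] delimit per-test options.
            in_section = (line == "[%s]" % test_name)
        elif in_section and line.startswith("skip-if"):
            oses.update(re.findall(r"os\s*==\s*'(\w+)'", line))
    return oses

# Made-up manifest fragment mirroring mochitest-style .ini syntax:
MANIFEST = """
[test_widget.html]
skip-if = os == 'android' || os == 'b2g'
[test_other.html]
"""
```

With this, a push that only touches test_widget.html could safely skip the
android and b2g jobs, since the test never runs there anyway.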


One thing to note is that there is going to be a *long* tail of these 
types of heuristics that we could come up with, so it would be nice to 
try to only address the ones that provide the most benefits.  For that, 
someone needs to look at the recent N commits on a given branch and 
figure out what jobs we _could_ have safely skipped for each one.


Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Ed Morley

On 19/08/2014 21:55, Benoit Girard wrote:

I completely agree with Jeff Gilbert on this one.

I think we should try to coalesce -better-. I just checked the current
state of mozilla-inbound and it doesn't feel like any of the current
patches really need their own set of tests, because they are not
time-sensitive or sufficiently complex. Right now developers are asked to
create bugs for their own change with their own patch. This leads to a
lot of little patches being landed by individual developers which
seems to reflect the current state of mozilla-inbound.

Perhaps we should instead promote checkin-needed (or a similar simple mechanism)
to coalesce simple changes together. Opting into this means that your
patch may take significantly longer to get merged if it's landed with
another bad patch and should only be used when that's acceptable.
Right now developers with commit access are not encouraged to make use
of checkin-needed AFAIK. If we started recommending against individual
landings for simple changes, and improved the process, we could
probably significantly cut the number of tests jobs by cutting the
number of pushes.


I agree we should try to coalesce better - however doing this via a 
manual "let's get someone to push a bunch of checkin-needed patches in 
one go" is suboptimal:
1) By tweaking coalescing in buildbot & pushing patches individually, we 
could get the same build+test job per commit ratio as doing 
checkin-neededs, but with the bonus of being able to backfill jobs where 
needed. This isn't possible when say 10-20 checkin-neededs are landed in 
one push, since our tooling can only trigger (and more importantly 
display the results of) jobs on a per push level.
2) Tooling can help make these decisions much more effectively and 
quickly than someone picking through bugs - ie we should expand the 
current "only schedule job X if directory Y changed" buildbotcustom 
logic further.
3) Adding a human in the workflow increases r+-to-committed cycle times, 
uses up scarce sheriff time, and also means the person who wrote the 
patch is not the one landing it, and so someone unfamiliar with the code 
often ends up being the one to resolve conflicts. We should be using 
tooling, not human cycles to land patches in a repo (ie the 
long-promised autoland).


Best wishes,

Ed


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Benjamin Smedberg

On 8/20/2014 3:07 AM, Mike Hommey wrote:


Optimized builds have been the default for a while, if not ever[1].


Bug 54828 made optimized builds the default in 2004 right before we 
released Firefox 1.0. It only took four years to make that decision ;-)


--BDS



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Mike Hommey
On Tue, Aug 19, 2014 at 11:26:42PM -0700, Jeff Gilbert wrote:
> I was just going to ask about this. I glanced through the mozconfigs
> in the tree for at least Linux debug, but it looks like it only has
> --enable-debug, not even -O1. Maybe it's buried somewhere in there,
> but I didn't find it with a quick look.
> 
> I took a look at the build log for WinXP debug, and --enable-opt is
> only present on the configure line for nspr, whereas --enable-debug is
> in a number of other places.

Optimized builds have been the default for a while, if not ever[1]. So
unless you add an explicit --disable-optimize, you still get an
optimized build, whether you use --enable-debug or not.

As a matter of fact, we *did* have --disable-optimize in the debug build
mozconfigs, but that was removed 3 years ago, in bug 669953.

Mike

1. At least, it was the case in the oldest tree we have in mercurial.


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jeff Gilbert
I was just going to ask about this. I glanced through the mozconfigs in the 
tree for at least Linux debug, but it looks like it only has --enable-debug, 
not even -O1. Maybe it's buried somewhere in there, but I didn't find it with a 
quick look.

I took a look at the build log for WinXP debug, and --enable-opt is only 
present on the configure line for nspr, whereas --enable-debug is in a number 
of other places.

Can we get confirmation for whether debug builds are (partially?) optimized? If 
not, we should do that. (Unless I'm missing a reason not to, especially if we 
only care about pass/fail, and not crash stacks/debugability)

-Jeff

- Original Message -
From: "Kyle Huey" 
To: "Joshua Cranmer 🐧" 
Cc: "dev-platform" 
Sent: Tuesday, August 19, 2014 3:56:27 PM
Subject: Re: Experiment with running debug tests less often on mozilla-inbound  
the week of August 25

I'm pretty sure the debug builds on our CI infrastructure are built
with optimization.

- Kyle

On Tue, Aug 19, 2014 at 3:42 PM, Joshua Cranmer 🐧  wrote:
> On 8/19/2014 5:25 PM, Ehsan Akhgari wrote:
>>
>> Yep, the debug tests indeed take more time, mostly because they run more
>> checks.
>
>
> Actually, the bigger cause of the slowdown is probably that debug tests
> don't have any optimizations, not more checks. An atomic increment on a
> debug build invokes something like a hundred instructions (including several
> call instructions) whereas the equivalent operation on an opt build is just
> one.
>
> --
> Joshua Cranmer
> Thunderbird and DXR developer
> Source code archæologist
>
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Kyle Huey
I'm pretty sure the debug builds on our CI infrastructure are built
with optimization.

- Kyle

On Tue, Aug 19, 2014 at 3:42 PM, Joshua Cranmer 🐧  wrote:
> On 8/19/2014 5:25 PM, Ehsan Akhgari wrote:
>>
>> Yep, the debug tests indeed take more time, mostly because they run more
>> checks.
>
>
> Actually, the bigger cause of the slowdown is probably that debug tests
> don't have any optimizations, not more checks. An atomic increment on a
> debug build invokes something like a hundred instructions (including several
> call instructions) whereas the equivalent operation on an opt build is just
> one.
>
> --
> Joshua Cranmer
> Thunderbird and DXR developer
> Source code archæologist
>
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Joshua Cranmer 🐧

On 8/19/2014 5:25 PM, Ehsan Akhgari wrote:
Yep, the debug tests indeed take more time, mostly because they run 
more checks.


Actually, the bigger cause of the slowdown is probably that debug tests 
don't have any optimizations, not more checks. An atomic increment on a 
debug build invokes something like a hundred instructions (including 
several call instructions) whereas the equivalent operation on an opt 
build is just one.


--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
No, fx-team is not affected by this experiment; we intend to target 
mozilla-inbound only for this 1-week trial.  The reason is that the 
number of commits on m-i seems larger than on fx-team, and therefore the 
impacts should be more visible.


Jonathan

On 8/19/2014 3:19 PM, Matthew N. wrote:

On 8/19/14 12:22 PM, Jonathan Griffin wrote:

To assess the impact of doing this, we will be performing an experiment
the week of August 25, in which we will run debug tests on
mozilla-inbound on most desktop platforms every other run, instead of
every run as we do now.  Debug tests on linux64 will continue to run
every time.  Non-desktop platforms and trees other than mozilla-inbound
will not be affected.


To clarify, is fx-team affected by this change? I ask because you 
mention "desktop" and that is where the desktop front-end team does 
landings. I suspect fx-team landings are less likely to hit debug-only 
issues than mozilla-inbound as fx-team has much fewer C++ changes and 
anecdotally JS-only changes seem to trigger debug-only failures less 
often.



This approach is based on the premise that the number of debug-only
platform-specific failures on desktop is low enough to be manageable,
and that the extra burden this imposes on the sheriffs will be small
enough compared to the improvement in test slave metrics to justify the
cost.


FWIW, I think fx-team is more desktop-specific (although Android 
front-end stuff also lands there and I'm not familiar with that).


MattN


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Trevor Saunders
On Tue, Aug 19, 2014 at 02:49:48PM -0700, Jonathan Griffin wrote:
> On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:
> >On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:
> >>I would actually say that debug tests are more important for continuous
> >>integration than opt tests. At least in code I deal with, we have a ton
> >>of asserts to guarantee behavior, and we really want test coverage with
> >>these via CI. If a test passes on debug, it should almost certainly pass
> >>on opt, just faster. The opposite is not true.
> >>
> >>"They take a long time and then break" is part of what I believe caused
> >>us to not bother with debug testing on much of Android and B2G, which we
> >>still haven't completely fixed. It should be unacceptable to ship
> >>without CI on debug tests, but here we are anyways. (This is finally
> >>nearly fixed, though there is still some work to do)
> >>
> >>I'm not saying running debug tests less often is on the same scale of
> >>bad, but I would like to express my concerns about heading in that
> >>direction.
> >
> >I second this.  I'm curious to know why you picked debug tests for this
> >experiment.  Would it not make more sense to run opt tests on desktop
> >platforms on every other run?
> >
> Just based on the fact that they take longer and thus running them less
> frequently would have a larger impact.  If there's a broad consensus that
> debug runs are more valuable, we could switch to running opt tests less
> frequently instead.

It seems to me our goal here is basically to pick an order so that the
expected time to detect bustage is minimized, without increasing the
maximum time it can take to detect bustage.  That is, take p(d) to be
the probability that only debug tests will fail, p(o) the probability
that only opt tests will fail, and p(b) the probability that both will
fail.  Then take t(d) and t(o) to be the times for a debug and an opt
test run respectively.  Now you want to decide which to run first,
debug or opt.  If you choose debug, you'd expect to detect bustage in
(p(d) + p(b)) * t(d) + p(o) * (t(o) + t(d))
which, assuming the three cases are exhaustive (so p(d) + p(o) + p(b) = 1),
simplifies to
t(d) + p(o) * t(o)
On the other hand, if you choose to test opt first you get
t(o) + p(d) * t(d)
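A quick numeric sketch of this model; the probabilities and run times below
are made up purely for illustration.

```python
# Expected time to detect bustage under the model above.  Assumes the three
# failure cases are exhaustive, i.e. p_d + p_o + p_b = 1 (we condition on
# the push being busted somewhere).  All numbers used here are hypothetical.

def expected_detection(first, p_d, p_o, p_b, t_d, t_o):
    """Expected detection time when `first` ('debug' or 'opt') runs first
    and the other suite runs afterwards when the first one passes."""
    if first == "debug":
        # Debug-only and both-fail cases are caught by the debug run;
        # opt-only failures additionally wait for the opt run.
        return (p_d + p_b) * t_d + p_o * (t_o + t_d)
    return (p_o + p_b) * t_o + p_d * (t_d + t_o)

# e.g. with t_d = 60 min, t_o = 30 min, p_d = 0.5, p_o = 0.2, p_b = 0.3,
# debug-first gives about 66 minutes and opt-first about 60 minutes, so
# with these made-up numbers opt-first wins even though p(d) > p(o).
```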

I suspect we all agree t(d) > t(o) and it seems likely p(d) > p(o), but
which is the better choice clearly depends on the exact values of those
numbers (and this is not a good model of reality in many ways).

Trev

> 
> Jonathan
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread David Burns
I know this is tangential but the small changes are the least tested 
changes in my experience. The try push requirement for checkin-needed 
has had a wonderful impact on the number of times the tree is closed[1]. 
The tree is less likely to be closed these days.


David

[1] http://futurama.theautomatedtester.co.uk/

On 19/08/2014 22:04, Ralph Giles wrote:

On 2014-08-19 1:55 PM, Benoit Girard wrote:

Perhaps we should instead promote checkin-needed (or a similar simple mechanism)
to coalesce simple changes together.

I would prefer to use 'checkin-needed' for more things, but am blocked
by the try-needed requirement. We need some way to bless small changes
for inbound without a try push. Look up the author's commit access maybe?

  -r


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Ehsan Akhgari

On 2014-08-19, 5:49 PM, Jonathan Griffin wrote:

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:

I would actually say that debug tests are more important for
continuous integration than opt tests. At least in code I deal with,
we have a ton of asserts to guarantee behavior, and we really want
test coverage with these via CI. If a test passes on debug, it should
almost certainly pass on opt, just faster. The opposite is not true.

"They take a long time and then break" is part of what I believe
caused us to not bother with debug testing on much of Android and
B2G, which we still haven't completely fixed. It should be
unacceptable to ship without CI on debug tests, but here we are
anyways. (This is finally nearly fixed, though there is still some
work to do)

I'm not saying running debug tests less often is on the same scale of
bad, but I would like to express my concerns about heading in that
direction.


I second this.  I'm curious to know why you picked debug tests for
this experiment.  Would it not make more sense to run opt tests on
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less
frequently would have a larger impact.  If there's a broad consensus
that debug runs are more valuable, we could switch to running opt tests
less frequently instead.


Yep, the debug tests indeed take more time, mostly because they run more 
checks.  :-)  The checks in opt builds are not exactly a subset of the 
ones in debug builds, but they are close.  Based on that, I think 
running opt tests on every other push is the more conservative option, 
and I support it more.  That being said, for this one-week limited trial, 
given that the sheriffs will help backfill the skipped tests, I don't 
care very strongly about this, as long as it doesn't set the precedent 
that we can ignore debug tests!


Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Matthew N.

On 8/19/14 12:22 PM, Jonathan Griffin wrote:

To assess the impact of doing this, we will be performing an experiment
the week of August 25, in which we will run debug tests on
mozilla-inbound on most desktop platforms every other run, instead of
every run as we do now.  Debug tests on linux64 will continue to run
every time.  Non-desktop platforms and trees other than mozilla-inbound
will not be affected.


To clarify, is fx-team affected by this change? I ask because you 
mention "desktop" and that is where the desktop front-end team does 
landings. I suspect fx-team landings are less likely to hit debug-only 
issues than mozilla-inbound, as fx-team has far fewer C++ changes and, 
anecdotally, JS-only changes seem to trigger debug-only failures less often.



This approach is based on the premise that the number of debug-only
platform-specific failures on desktop is low enough to be manageable,
and that the extra burden this imposes on the sheriffs will be small
enough compared to the improvement in test slave metrics to justify the
cost.


FWIW, I think fx-team is more desktop-specific (although Android 
front-end stuff also lands there and I'm not familiar with that).


MattN


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:
I would actually say that debug tests are more important for 
continuous integration than opt tests. At least in code I deal with, 
we have a ton of asserts to guarantee behavior, and we really want 
test coverage with these via CI. If a test passes on debug, it should 
almost certainly pass on opt, just faster. The opposite is not true.


"They take a long time and then break" is part of what I believe 
caused us to not bother with debug testing on much of Android and 
B2G, which we still haven't completely fixed. It should be 
unacceptable to ship without CI on debug tests, but here we are 
anyways. (This is finally nearly fixed, though there is still some 
work to do)


I'm not saying running debug tests less often is on the same scale of 
bad, but I would like to express my concerns about heading in that 
direction.


I second this.  I'm curious to know why you picked debug tests for 
this experiment.  Would it not make more sense to run opt tests on 
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less 
frequently would have a larger impact.  If there's a broad consensus 
that debug runs are more valuable, we could switch to running opt tests 
less frequently instead.


Jonathan


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
I also agree about coalescing better.  We are looking at ways to do that 
in conjunction with 
https://wiki.mozilla.org/Auto-tools/Projects/Autoland, which we'll have 
a prototype of by the end of the quarter.  In this model, commits that 
are going through autoland could be coalesced when landing on inbound, 
which would reduce slave load on all platforms.


Until that's deployed and in widespread use, we have other options to 
decrease slave load, and this experiment is the simplest.  It won't 
result in reduced test coverage, since sheriffs will backfill in the 
case of a regression.  Essentially, we're not running tests that would 
have passed anyway.


Depending on feedback we receive after this experiment, we may opt to 
change our approach in the future:  i.e., run tests on every Nth opt build 
instead of every Nth debug build, or try to identify sets of "never failing" 
tests and run those less frequently, or always include at least one 
flavor of Windows, OS X, and Linux on every commit, etc.


Regards,

Jonathan


On 8/19/2014 1:55 PM, Benoit Girard wrote:

I completely agree with Jeff Gilbert on this one.

I think we should try to coalesce -better-. I just checked the current
state of mozilla-inbound, and it doesn't feel like any of the current
patches really need their own set of tests, because they're not time
sensitive or sufficiently complex. Right now developers are asked to
create bugs for their own change with their own patch. This leads to a
lot of little patches being landed by individual developers, which
seems to reflect the current state of mozilla-inbound.

Perhaps we should instead promote checkin-needed (or a similar simple
mechanism) to coalesce simple changes together. Opting into this means
that your patch may take significantly longer to get merged if it's
landed with another bad patch, and it should only be used when that's
acceptable.
Right now developers with commit access are not encouraged to make use
of checkin-needed AFAIK. If we started recommending against individual
landings for simple changes, and improved the process, we could
probably significantly cut the number of tests jobs by cutting the
number of pushes.

On Tue, Aug 19, 2014 at 3:57 PM, Jeff Gilbert  wrote:

I would actually say that debug tests are more important for continuous 
integration than opt tests. At least in code I deal with, we have a ton of 
asserts to guarantee behavior, and we really want test coverage with these via 
CI. If a test passes on debug, it should almost certainly pass on opt, just 
faster. The opposite is not true.

"They take a long time and then break" is part of what I believe caused us to 
not bother with debug testing on much of Android and B2G, which we still haven't 
completely fixed. It should be unacceptable to ship without CI on debug tests, but here 
we are anyways. (This is finally nearly fixed, though there is still some work to do)

I'm not saying running debug tests less often is on the same scale of bad, but 
I would like to express my concerns about heading in that direction.

-Jeff

- Original Message -
From: "Jonathan Griffin" 
To: dev-platform@lists.mozilla.org
Sent: Tuesday, August 19, 2014 12:22:21 PM
Subject: Experiment with running debug tests less often on mozilla-inbound  
the week of August 25

Our pools of test slaves are often at or over capacity, and this has the
effect of increasing job coalescing and test wait times.  This, in turn,
can lead to longer tree closures caused by test bustage, and can cause
try runs to be very slow to complete.

One of the easiest ways to mitigate this is to run tests less often.

To assess the impact of doing this, we will be performing an experiment
the week of August 25, in which we will run debug tests on
mozilla-inbound on most desktop platforms every other run, instead of
every run as we do now.  Debug tests on linux64 will continue to run
every time.  Non-desktop platforms and trees other than mozilla-inbound
will not be affected.

This approach is based on the premise that the number of debug-only
platform-specific failures on desktop is low enough to be manageable,
and that the extra burden this imposes on the sheriffs will be small
enough compared to the improvement in test slave metrics to justify the
cost.

While this experiment is in progress, we will be monitoring job
coalescing and test wait times, as well as impacts on sheriffs and
developers.  If the experiment causes sheriffs to be unable to perform
their job effectively, it can be terminated prematurely.

We intend to use the data we collect during the experiment to inform
decisions about additional tooling we need to make this or a similar
plan permanent at some point in the future, as well as validating the
premise on which this experiment is based.

After the conclusion of this experiment, a follow-up post will be made 
which will discuss our findings.  If you have any concerns, feel free to 
reach out to me.

Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Ehsan Akhgari

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:

I would actually say that debug tests are more important for continuous 
integration than opt tests. At least in code I deal with, we have a ton of 
asserts to guarantee behavior, and we really want test coverage with these via 
CI. If a test passes on debug, it should almost certainly pass on opt, just 
faster. The opposite is not true.

"They take a long time and then break" is part of what I believe caused us to 
not bother with debug testing on much of Android and B2G, which we still haven't 
completely fixed. It should be unacceptable to ship without CI on debug tests, but here 
we are anyways. (This is finally nearly fixed, though there is still some work to do)

I'm not saying running debug tests less often is on the same scale of bad, but 
I would like to express my concerns about heading in that direction.


I second this.  I'm curious to know why you picked debug tests for this 
experiment.  Would it not make more sense to run opt tests on desktop 
platforms on every other run?


Cheers,
Ehsan



Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Ralph Giles
On 2014-08-19 1:55 PM, Benoit Girard wrote:
> Perhaps we should instead promote checkin-needed (or a similar simple
> mechanism) to coalesce simple changes together.

I would prefer to use 'checkin-needed' for more things, but am blocked
by the try-needed requirement. We need some way to bless small changes
for inbound without a try push. Look up the author's commit access maybe?

 -r


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Benoit Girard
I completely agree with Jeff Gilbert on this one.

I think we should try to coalesce -better-. I just checked the current
state of mozilla-inbound, and it doesn't feel like any of the current
patches really need their own set of tests, because they're not time
sensitive or sufficiently complex. Right now developers are asked to
create bugs for their own change with their own patch. This leads to a
lot of little patches being landed by individual developers, which
seems to reflect the current state of mozilla-inbound.

Perhaps we should instead promote checkin-needed (or a similar simple
mechanism) to coalesce simple changes together. Opting into this means
that your patch may take significantly longer to get merged if it's
landed with another bad patch, and it should only be used when that's
acceptable.
Right now developers with commit access are not encouraged to make use
of checkin-needed AFAIK. If we started recommending against individual
landings for simple changes, and improved the process, we could
probably significantly cut the number of tests jobs by cutting the
number of pushes.

On Tue, Aug 19, 2014 at 3:57 PM, Jeff Gilbert  wrote:
> I would actually say that debug tests are more important for continuous 
> integration than opt tests. At least in code I deal with, we have a ton of 
> asserts to guarantee behavior, and we really want test coverage with these 
> via CI. If a test passes on debug, it should almost certainly pass on opt, 
> just faster. The opposite is not true.
>
> "They take a long time and then break" is part of what I believe caused us to 
> not bother with debug testing on much of Android and B2G, which we still 
> haven't completely fixed. It should be unacceptable to ship without CI on 
> debug tests, but here we are anyways. (This is finally nearly fixed, though 
> there is still some work to do)
>
> I'm not saying running debug tests less often is on the same scale of bad, 
> but I would like to express my concerns about heading in that direction.
>
> -Jeff
>
> - Original Message -
> From: "Jonathan Griffin" 
> To: dev-platform@lists.mozilla.org
> Sent: Tuesday, August 19, 2014 12:22:21 PM
> Subject: Experiment with running debug tests less often on mozilla-inbound
>   the week of August 25
>
> Our pools of test slaves are often at or over capacity, and this has the
> effect of increasing job coalescing and test wait times.  This, in turn,
> can lead to longer tree closures caused by test bustage, and can cause
> try runs to be very slow to complete.
>
> One of the easiest ways to mitigate this is to run tests less often.
>
> To assess the impact of doing this, we will be performing an experiment
> the week of August 25, in which we will run debug tests on
> mozilla-inbound on most desktop platforms every other run, instead of
> every run as we do now.  Debug tests on linux64 will continue to run
> every time.  Non-desktop platforms and trees other than mozilla-inbound
> will not be affected.
>
> This approach is based on the premise that the number of debug-only
> platform-specific failures on desktop is low enough to be manageable,
> and that the extra burden this imposes on the sheriffs will be small
> enough compared to the improvement in test slave metrics to justify the
> cost.
>
> While this experiment is in progress, we will be monitoring job
> coalescing and test wait times, as well as impacts on sheriffs and
> developers.  If the experiment causes sheriffs to be unable to perform
> their job effectively, it can be terminated prematurely.
>
> We intend to use the data we collect during the experiment to inform
> decisions about additional tooling we need to make this or a similar
> plan permanent at some point in the future, as well as validating the
> premise on which this experiment is based.
>
> After the conclusion of this experiment, a follow-up post will be made
> which will discuss our findings.  If you have any concerns, feel free to
> reach out to me.
>
> Jonathan
>


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jeff Gilbert
I would actually say that debug tests are more important for continuous 
integration than opt tests. At least in code I deal with, we have a ton of 
asserts to guarantee behavior, and we really want test coverage with these via 
CI. If a test passes on debug, it should almost certainly pass on opt, just 
faster. The opposite is not true.

"They take a long time and then break" is part of what I believe caused us to 
not bother with debug testing on much of Android and B2G, which we still 
haven't completely fixed. It should be unacceptable to ship without CI on debug 
tests, but here we are anyways. (This is finally nearly fixed, though there is 
still some work to do)

I'm not saying running debug tests less often is on the same scale of bad, but 
I would like to express my concerns about heading in that direction.

-Jeff

- Original Message -
From: "Jonathan Griffin" 
To: dev-platform@lists.mozilla.org
Sent: Tuesday, August 19, 2014 12:22:21 PM
Subject: Experiment with running debug tests less often on mozilla-inbound  
the week of August 25

Our pools of test slaves are often at or over capacity, and this has the 
effect of increasing job coalescing and test wait times.  This, in turn, 
can lead to longer tree closures caused by test bustage, and can cause 
try runs to be very slow to complete.

One of the easiest ways to mitigate this is to run tests less often.

To assess the impact of doing this, we will be performing an experiment 
the week of August 25, in which we will run debug tests on 
mozilla-inbound on most desktop platforms every other run, instead of 
every run as we do now.  Debug tests on linux64 will continue to run 
every time.  Non-desktop platforms and trees other than mozilla-inbound 
will not be affected.

This approach is based on the premise that the number of debug-only 
platform-specific failures on desktop is low enough to be manageable, 
and that the extra burden this imposes on the sheriffs will be small 
enough compared to the improvement in test slave metrics to justify the 
cost.

While this experiment is in progress, we will be monitoring job 
coalescing and test wait times, as well as impacts on sheriffs and 
developers.  If the experiment causes sheriffs to be unable to perform 
their job effectively, it can be terminated prematurely.

We intend to use the data we collect during the experiment to inform 
decisions about additional tooling we need to make this or a similar 
plan permanent at some point in the future, as well as validating the 
premise on which this experiment is based.

After the conclusion of this experiment, a follow-up post will be made 
which will discuss our findings.  If you have any concerns, feel free to 
reach out to me.

Jonathan



Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-19 Thread Jonathan Griffin
Our pools of test slaves are often at or over capacity, and this has the 
effect of increasing job coalescing and test wait times.  This, in turn, 
can lead to longer tree closures caused by test bustage, and can cause 
try runs to be very slow to complete.


One of the easiest ways to mitigate this is to run tests less often.

To assess the impact of doing this, we will be performing an experiment 
the week of August 25, in which we will run debug tests on 
mozilla-inbound on most desktop platforms every other run, instead of 
every run as we do now.  Debug tests on linux64 will continue to run 
every time.  Non-desktop platforms and trees other than mozilla-inbound 
will not be affected.
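
As a rough sketch of the skipping policy (hypothetical illustration only, 
not the actual buildbot scheduler code; the function and parameter names 
are invented for this example):

```python
def should_run_debug_tests(push_index, platform):
    """Decide whether to schedule debug test jobs for a push.

    Hypothetical sketch of the every-other-run policy: linux64 keeps
    debug tests on every push, while other desktop platforms run them
    only on every second push.
    """
    if platform == "linux64":
        return True  # linux64 debug tests continue to run every time
    return push_index % 2 == 0  # other platforms: every other push


# e.g. push 0 runs debug tests everywhere; push 1 only on linux64
```

In practice the real scheduler keys off build properties rather than a
simple push counter, but the effect on desktop debug jobs is the same.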


This approach is based on the premise that the number of debug-only 
platform-specific failures on desktop is low enough to be manageable, 
and that the extra burden this imposes on the sheriffs will be small 
enough compared to the improvement in test slave metrics to justify the 
cost.


While this experiment is in progress, we will be monitoring job 
coalescing and test wait times, as well as impacts on sheriffs and 
developers.  If the experiment causes sheriffs to be unable to perform 
their job effectively, it can be terminated prematurely.


We intend to use the data we collect during the experiment to inform 
decisions about additional tooling we need to make this or a similar 
plan permanent at some point in the future, as well as validating the 
premise on which this experiment is based.


After the conclusion of this experiment, a follow-up post will be made 
which will discuss our findings.  If you have any concerns, feel free to 
reach out to me.


Jonathan
