On-going Treeherder issues

2019-11-21 Thread Armen Zambrano Gasparnian
Hello all,
As you may have noticed, the trees have been closed for most of the day. There
are ongoing database issues that we're trying to solve.

For up-to-date information you can join #treeherder on IRC or follow
https://bugzilla.mozilla.org/show_bug.cgi?id=1597136.

We appreciate your patience.

Thank you,
Treeherder team

--
Armen Zambrano G.
https://armenzg.com
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: How can I run Firefox programmatically in fullscreen?

2017-06-27 Thread Armen Zambrano Gasparnian
The main idea behind my question was to change the AWFY Speedometer benchmark 
to start Firefox in full screen (since the score depends on window size).
On second thought, for this to work I would also have to figure out how to 
make Chrome and Edge start maximized (my apologies; I said full screen when I 
meant maximized).
It would be ideal if I could figure this out for all three browsers; however, with other 
work and time constraints I will have to set this idea aside.

Thank you all for chiming in,
Armen

On Jun 26, 2017, 8:32 PM -0400, Michael Cooper <mcoo...@mozilla.com>, wrote:
> I'm not sure this is quite what you're looking for, but for Corsica (the 
> software powering the ambient displays around the office) we do this by 
> setting the pref full-screen-api.allow-trusted-requests-only to false. This 
> then allows the webpage we load (Corsica) to immediately request full screen 
> using the normal DOM methods.
>
> > On Mon, Jun 26, 2017 at 2:12 PM, Armen Zambrano Gasparnian 
> > <arme...@mozilla.com> wrote:
> > > Asking around and looking on DXR or MDN did not yield anything useful.
> > >
> > > I don't want to have to use Marionette in this specific automation 
> > > context.
> > >
> > > Thanks in advance,
> > > Armen
> > > ___
> > > dev-platform mailing list
> > > dev-platform@lists.mozilla.org
> > > https://lists.mozilla.org/listinfo/dev-platform
>
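
A minimal sketch of the approach Michael describes, for anyone who wants to try it:
flip the pref in a throwaway profile and let the loaded page call the Fullscreen API
itself. The Firefox binary path and the URL below are placeholders.

# Launch Firefox with a throwaway profile whose user.js flips the pref
# Michael mentions; the loaded page is then expected to call
# element.requestFullscreen() itself via the normal DOM API.
import os
import subprocess
import tempfile

FIREFOX = "/usr/bin/firefox"                   # placeholder: local Firefox binary
URL = "http://localhost:8000/benchmark.html"   # placeholder: page that requests fullscreen

profile_dir = tempfile.mkdtemp(prefix="fullscreen-profile-")
with open(os.path.join(profile_dir, "user.js"), "w") as prefs:
    # Allow script-initiated (untrusted) fullscreen requests.
    prefs.write('user_pref("full-screen-api.allow-trusted-requests-only", false);\n')

subprocess.call([FIREFOX, "-no-remote", "-profile", profile_dir, URL])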
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


How can I run Firefox programmatically in fullscreen?

2017-06-26 Thread Armen Zambrano Gasparnian
Asking around and looking on DXR or MDN did not yield anything useful.

I don't want to have to use Marionette in this specific automation context.

Thanks in advance,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Quantum Flow Engineering Newsletter #14

2017-06-26 Thread Armen Zambrano Gasparnian
On Friday, 23 June 2017 10:05:42 UTC-4, Ehsan Akhgari  wrote:
> On Fri, Jun 23, 2017 at 4:19 AM, Chris Peterson 
> wrote:
> 
> >
> >
> > On 6/23/17 12:17 AM, Ehsan Akhgari wrote:
> >
> > But to speak of a more direct measurement of performance, let's look at
> > our progress on Speedometer V2
> > .

Ehsan, were you comparing against http://speedometer2.benj.me?
Or this http://browserbench.org/Speedometer/?
Or using the AWFY code? (It uses a local proxy.)

> > Today, I measured our progress so far on this benchmark by comparing
> > Firefox 53, 54, 55.0b3 (latest beta as of this writing) and the latest
> > Nightly, all x64 builds, on the reference hardware
> > .  This is the result (numbers are
> > the reported benchmark score, higher is better):
> >
> > [image: Speedometer improvements]
> >
> >
> > How do these Speedometer V2 scores map to the results on AWFY? AWFY shows
> > many Speedometer sub-tests, but no score in the range of 70.21. AWFY
> > machine #36 is the reference hardware.
> >
> > https://arewefastyet.com/#machine=36=breakdown=speedometer-misc
> >
> 
> Armen has been investigating the difference.  The Speedometer benchmark is
> extremely sensitive to anything else that is going on on the machine at the
> time you are running the tests, see for example
> https://bugzilla.mozilla.org/show_bug.cgi?id=1373396#c3 where he discovered
> that turning off the "Shut off display after 15 minutes" setting improves
> our benchmark score by about 10 points or so!  I think there are other
> investigations ongoing to dig into the remaining difference as well.
> 

Currently, machine #36 is testing against non-PGO builds from inbound (to catch 
regressions).
I'm looking into adding mozilla-central PGO builds to the mix.
I assume PGO builds can run slightly faster than non-PGO builds.

Once we have PGO builds we will be able to see if there are any more machine 
configuration changes required (I doubt it).

Direct link to the speedometer score on machine #36:
https://arewefastyet.com/#machine=36=single=speedometer-misc=score

> (Also note that there is a speed difference on Speedometer between Nightly
> and Beta, where Nightly with the same code will be a bit slower than Beta
> since some Nightly specific features do show up in Speedometer profiles
> currently, for example things like bug 1375568 are currently Nightly only,
> and there are also debugging checks like <
> https://searchfox.org/mozilla-central/rev/3291398f10dcbe192fb52e74974b172616c018aa/ipc/chromium/src/base/pickle.h#26>
> that show up a bit in profiles as well.  I think that explains why 55.0b3
> is scoring so high comparing to 56.0a1 there.)
> 
> Cheers,
> -- 
> Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fwd: Scheduling requests misbehaving in the last two days

2016-11-07 Thread Armen Zambrano G.

It seems I forgot to hit send on Friday.
You can now make "add new jobs" and "backfill" requests again.

Again, my apologies for this.

regards,
Armen

On 2016-11-01 08:55 PM, Armen Zambrano G. wrote:

Spreading the original information beyond the original mailing list.

Unfortunately, requests to backfill jobs or add new jobs (and
similar) are not keeping up; the requests are not being processed fast
enough.

I will email again once it is resolved.

If you're curious, here's the bug I'm using
https://bugzilla.mozilla.org/show_bug.cgi?id=1314396

My apologies for this inconvenience.

regards,
Armen

 Forwarded Message 
Subject: Scheduling requests misbehaving in the last two days
Date: Tue, 1 Nov 2016 14:21:12 -0400
From: Armen Zambrano G. <arme...@mozilla.com>
Organization: Mozilla Corporation
To: dev-tree-managem...@lists.mozilla.org
Newsgroups: mozilla.dev.tree-management

If you've tried to add new jobs, backfill, or make similar special scheduling
requests from Treeherder, you might have noticed incorrect behaviour.

Unfortunately, I've introduced a bug in the last few days that has been
a bit difficult to recover from.

I've changed the design of pulse_actions to help handle these situations
better in the future.

I believe we're back to normal. My apologies for the inconvenience.

Please let me know if you're interested in more details.

regards,
Armen




--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Fwd: Scheduling requests misbehaving in the last two days

2016-11-01 Thread Armen Zambrano G.

Spreading the original information beyond the original mailing list.

Unfortunately, requests to backfill jobs or add new jobs (and 
similar) are not keeping up; the requests are not being processed fast 
enough.


I will email again once it is resolved.

If you're curious, here's the bug I'm using
https://bugzilla.mozilla.org/show_bug.cgi?id=1314396

My apologies for this inconvenience.

regards,
Armen

 Forwarded Message 
Subject: Scheduling requests misbehaving in the last two days
Date: Tue, 1 Nov 2016 14:21:12 -0400
From: Armen Zambrano G. <arme...@mozilla.com>
Organization: Mozilla Corporation
To: dev-tree-managem...@lists.mozilla.org
Newsgroups: mozilla.dev.tree-management

If you've tried to add new jobs, backfill, or make similar special scheduling 
requests from Treeherder, you might have noticed incorrect behaviour.


Unfortunately, I've introduced a bug in the last few days that has been 
a bit difficult to recover from.


I've changed the design of pulse_actions to help handle these situations 
better in the future.


I believe we're back to normal. My apologies for the inconvenience.

Please let me know if you're interested in more details.

regards,
Armen

--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #8

2016-10-28 Thread Armen Zambrano G.

NEW:
* Debugging on remote workers is no longer a main priority for this quarter 
since it has been completed

* We've added an investigation into hyper chunking

In this update we will look at the progress made in the last two weeks.

This quarter’s main focus is on improving end to end times on Try 
(Thunder Try project).


For all bugs and priorities you can check out:
https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements


Thunder Try - Improve end to end times on try
-
Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882


Accomplished recently:
* All --artifact build jobs now provide crash-reporter symbols.
** This means test jobs scheduled against linux64 _debug_ work with 
--artifact.



Upcoming:
* Patch in review - debug artifact builds on try.
** (Right now --artifact always results in an opt artifact build.)


Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641


Accomplished recently:
* Green builds on all platforms on Try

Upcoming:
* Compare build times vs. existing Python sccache now


Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856


Accomplished recently:
* Stabilized dashboard 
http://people.mozilla.org/~klahnakoski/MoBuildbotTimings/End-to-End.html
* TaskCluster ingestion of Mozharness steps is now working, but it is not 
shown on the charts yet.



Upcoming:
* Still has the “small population” problem, where outliers make the 90th 
percentile look big.

* Add a TaskCluster view of End-to-End times
* Start knocking off bugs in the dashboard


Project #4 - Build automation improvements
##
Nothing new for this edition


Project #5 - Hyper chunking
###
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262834


Nothing new for this edition


Project #6 - Run Web platform tests from the source checkout

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286900


Feature blocked on misconfigured EBS volumes in AWS (bug 1305174)
--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #7

2016-10-12 Thread Armen Zambrano G.

In this update we will look at the progress made in the last two weeks.

A reminder that this quarter’s main focus is on:
* Debugging tests on interactive workers (only Linux on TaskCluster)
* Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management 
page for it:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements

Status update:
Debugging tests on interactive workers
---
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262260

Upcoming:
* Android xpcshell
* Blog/newsgroup post


Thunder Try - Improve end to end times on try
-

Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Accomplished recently:
* The following platforms are now supported: linux, linux64, macosx64, 
win32, win64
* An option was added to download symbols for our compiled artifacts 
during the artifact build


Upcoming:
* Debug artifact builds on try. (Right now --artifact always results in 
an opt artifact build.)

* Android artifact builds on try, thanks to nalexander.

Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Some of the issues found last quarter for this project were around NSS, 
which was also in need of replacing. The project was put on hold until 
the NSS work was completed. We're going to resume it in Q4.


Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:
* Brittle running example here: 
http://people.mozilla.org/~klahnakoski/temp/End-to-End.html
* Problem with low populations; the 90th percentile is effectively 
everything, and a couple of outliers impact the End-to-End time shown (see the 
sketch below).


Upcoming:
* Figure out what to do with these small populations:
* Ignore them - too small to be statistically significant
* Aggregate them - all the rarely run suites can be pushed into an 
“Other” category

* Show some other statistic: maybe the median is better?
* Show the median for the past day and the 90th percentile for the week: that can show 
the longer trend and the short-term situation, for a better overall feel.
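
A toy illustration (not the dashboard's code) of why the 90th percentile is so 
fragile for rarely run suites while the median barely moves; the run times below 
are made up:

import math
import statistics

def p90(values):
    # Nearest-rank 90th percentile.
    ordered = sorted(values)
    return ordered[math.ceil(0.9 * len(ordered)) - 1]

typical = [42, 45, 47, 50, 52]        # five end-to-end times (minutes) for a rarely run suite
with_outlier = typical + [180]        # one bad run, e.g. a long pending backlog

for label, runs in (("typical", typical), ("with outlier", with_outlier)):
    print(label, "median:", statistics.median(runs), "p90:", p90(runs))

# typical       median: 47    p90: 52
# with outlier  median: 48.5  p90: 180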


Project #4 - Build automation improvements
##
Accomplished recently:
* Bug 1306167 - Updated build machines to use SSD. Linux PGO builds now 
take half the time


https://bugzilla.mozilla.org/show_bug.cgi?id=1306167#c11

Project #5 - Run Web platform tests from the source checkout

Nothing to add on this edition.

--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #5

2016-09-13 Thread Armen Zambrano G.

In this update we will look at the progress made in the last two weeks.

A reminder that this quarter’s main focus is on:
* Debugging tests on interactive workers (only Linux on TaskCluster)
* Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management 
page for it: 
https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements


Status update:

Debugging tests on interactive workers
---
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262260

Accomplished recently:
* Support for Android mochitests on interactive loaners
* Fixed race condition when waiting for loaner to be ready

Upcoming:
* Support for Android reftests on interactive loaners
* Support for Android xpcshell on interactive loaners


Thunder Try - Improve end to end times on try
-

Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Accomplished recently:
* Landed --artifact try flag for forcing artifact builds (linux64 only 
for now)


Upcoming:
* Restrict --artifact try flag to requests that don’t involve 
compiled-code tests


Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Most issues with the sccache rewrite have been ironed out. Another test on 
try is required to validate it.


Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Upcoming:
* Working on showing the high-level (90th? percentile) “end-to-end” 
times. The hope is to reveal the longest running chain of jobs and 
provide a drill-down to the steps that explain the numbers. This 
particular statistic may be dubious; however, it provides something on 
which to build an interactive UI for other statistics.


Other
#
Bug 1272083 - Downloading/unzipping to be performed in memory
* The initial patch landed last week
* Unfortunately, a few intermittents were introduced due to a bug in the code
* The new patch landed today and should clear up the intermittents that were 
introduced (see the sketch below)
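
For reference, a minimal sketch of the idea behind bug 1272083 (unpacking the 
archive straight from memory instead of saving the zip to disk first); this is 
an illustration, not the mozharness patch, and the URL is a placeholder:

import io
import zipfile

import requests

ARCHIVE_URL = "https://example.com/firefox-tests.zip"   # placeholder

response = requests.get(ARCHIVE_URL)
response.raise_for_status()

# Keep the archive in an in-memory buffer and extract directly from it;
# the zip file itself never touches the disk.
with zipfile.ZipFile(io.BytesIO(response.content)) as archive:
    archive.extractall("tests")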


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1290282

--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #4

2016-08-30 Thread Armen Zambrano
Hello,
In this update we will look at the progress made in the last two weeks.

A reminder that this quarter’s main focus is on:
Debugging tests on interactive workers (only Linux on TaskCluster)
Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management
page for it:
https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements

Status update:
Debugging tests on interactive workers
---
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262260

Accomplished recently:
* Landed support for |mach marionette-test| on loaner
* Fixed xpcshell bug where specifying a path didn’t work

Upcoming:
* Support for Android jobs (starting with mochitest)


Thunder Try - Improve end to end times on try
-

Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Accomplished recently:
* Implemented (not yet landed) ‘-artifact’ flag in try syntax which
replaces the full build step with an artifact build. Desktop Linux64 is working so
far; we will land it once it is also working on Win/OS X.

Upcoming:
* Restrict -artifact try flag to requests that don’t involve
compiled-code tests

Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Nothing new for this edition.

Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:
* Working on Infraherder RFC, incorporating comments and feedback

Upcoming:
* Working on expanding prototype to work dynamically with more repositories
* Experimenting with using ActiveData to query for data
* Scope work based on Infraherder RFC

Other
#
Bug 1272083 - Downloading/unzipping to be performed in memory
* Having issues on e10s Win7 crashtests with
EXCEPTION_ACCESS_VIOLATION_WRITE

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1290282
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Report your development frustrations via `mach rage`

2016-08-08 Thread Armen Zambrano G.

I agree with the last suggestion. Words and context matter.

On 2016-08-08 04:25 PM, Jim Blandy wrote:

 LOL, but honestly, one of the ways I get myself to treat people better is
to avoid the whole rage / tableflip / flame vocabulary when thinking about
what I want to do. Could we publicize this as "mach gripe", and leave
"rage" as an alias?

On Mon, Aug 8, 2016 at 10:51 AM, Gregory Szorc  wrote:


Sometimes when hacking on Firefox/Gecko you experience something that irks
you. But filing a bug isn't appropriate or could be time consuming. You
instead vent frustrations on IRC, with others around the figurative water
cooler, or - even worse - you don't tell anyone.

The Developer Productivity Team would like to know when you have a bad
experience hacking on Firefox/Gecko so we can target what to improve to
make your work experience better and more productive.

If you update to the latest commit on mozilla-central, you'll find a new
mach command: `mach rage`.

`mach rage` opens a small web form where you can quickly express
frustration about anything related to developing Firefox/Gecko. It asks a
few simple questions:

* Where did you encounter a problem
* How severe was it
* What was it
* (optional) How can we fix it

If you don't want to use `mach rage`, just load
https://docs.google.com/forms/d/e/1FAIpQLSeDVC3IXJu5d33Hp_
ZTCOw06xEUiYH1pBjAqJ1g_y63sO2vvA/viewform

I encourage developers to vent as many frustrations as possible. Think of
each form submission as a vote. The more times we see a particular pain
point mentioned, the higher the chances something will be done about it.

This form is brand new, so if you have suggestions for improving it, we'd
love to hear them. Send feedback to g...@mozilla.com and/or
jgrif...@mozilla.com.

Happy raging.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform




--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #2

2016-08-02 Thread Armen Zambrano G.

In this update, we will look at the progress made since our initial update.

A reminder that this quarter’s main focus is on:
* Debugging tests on interactive workers (only Linux on TaskCluster)
* Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management 
page for it:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements



Debugging tests on interactive workers
---
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1262260

Accomplished recently:
* Bug 1285582 - Fixed Xvfb startup issue
* Bug 1288827 - Improved mochitest UX (no longer need --appname, paths 
normalized)

* Bug 1289879 - Uses mozharness venv if available

Upcoming:
* Support for smaller test harnesses (Cpp, Mn, wpt, etc)
* Improved one-click-loaner UX

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1285582
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1288827
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1289879



Thunder Try - Improve end to end times on try
-

Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

No news for this edition and probably the next one.


Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

* Working on testing sccache re-write on Try
* More news on following update


Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:
* Bug 1242017 - Metrics team will configure ingestion point into Telemetry

Upcoming:
* Bug 1258861 - Working on underlying data model at the moment:

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1242017
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1258861

Other
#
* Bug 1287604 - Experiment with different AWS instance types for TC 
linux64 builds
  * Some initial experiments have shown we can shave 20 minutes off an 
average linux64 build by using more powerful AWS instances, with a 
reasonable cost tradeoff. We’ll start the work of migrating to these new 
instances soon.
* Bug 1272083 - Downloading and unzipping should be performed as data is 
received
  * Project for investigation has been started: 
https://github.com/armenzg/download_and_unpack

* Bug 1286336 - Improve interaction of automation with version control
  * Buildbot AMIs now seeded with mozilla-unified repo (Bug 1232442)
  * TaskCluster decision and various lint/test tasks now use `hg 
robustcheckout` and share caches more optimally (Bug 1247168)

* Flake8 tasks now complete in as little as 9s (~3m before)
* Decision tasks now complete in <60s on average
  * Some TaskCluster tasks now share VCS checkouts on Try (Bug 1289643)
* Tasks will complete faster on Try due to not having to perform 
full VCS checkout


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1287604
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1272083
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1247168

--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Adding new jobs to pushes is not currently working

2016-07-19 Thread Armen Zambrano G.

The Pulse infra issue has been resolved and we're now back in business.

On 2016-07-18 10:38 AM, Armen Zambrano G. wrote:

Hello,
Treeherder sends a Pulse message to add new jobs to your pushes. Those
messages are currently not arriving; thus, the system won't work.

If you want to know when it gets fixed you can follow this bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1287404

Our apologies for any inconveniences this may cause you.

regards,
Armen

---
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca



--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Usability improvements for Firefox automation initiative - Status update #1

2016-07-19 Thread Armen Zambrano G.
The developer survey conducted by Engineering Productivity last fall 
indicated that debugging test failures that are reported by automation 
is a significant frustration for many developers. In fact, it was the 
biggest deficit identified by the survey. As a result,
the Engineering Productivity Team (aka A-Team) is working on improving 
the user experience for debugging test failures in our continuous 
integration and speeding up the turnaround for Try server jobs.


This quarter’s main focus is on:
* Debugging tests on interactive workers (only Linux on TaskCluster)
* Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management 
page for it:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Debugging_UX_improvements

In this email you will find the progress we’ve made recently. In future 
updates you will see a delta from this email.


P.S. These status updates will be fortnightly.


Debugging tests on interactive workers
---
Accomplished recently:
* Landed support for running reftest and xpcshell via tests.zip
* Many UX improvements to the interactive loaner workflow

Upcoming:
* Make sure Xvfb is running so you can actually run the tests!
* Mochitest support + all other harnesses


Thunder Try - Improve end to end times on try


Project #1 - Artifact builds on automation
##
Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1284882

Accomplished recently:
* Landed prerequisites for Windows and OS X artifact builds on try.
* Identified which tests should be skipped with artifact builds

Upcoming:
* Provide a try syntax flag to trigger only artifact builds instead of 
full builds; starting with opt Linux 64.



Project #2 - S3 Cloud Compiler Cache

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1280641

Accomplished recently:
* Sccache’s Rust re-write has reached feature parity with Python’s sccache
* Now testing sccache2 on Try

Upcoming:
* We want to roll out a two-tier sccache for Try, which will enable it 
to benefit from cache objects from integration branches



Project #3 - Metrics

Tracking bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1286856

Accomplished recently:
* Preliminary analytics / research based on job data from Treeherder 
found at: 
http://nbviewer.jupyter.org/url/people.mozilla.org/%7Ewlachance/try%20analysis.ipynb

  * Which jobs finish last?
  * Which jobs have the highest wait times?
  * Which jobs have the longest total wall clock time (i.e. are the 
largest consumers of resources)


Upcoming:
* Putting Mozharness steps’ data inside Treeherder’s database for 
aggregate analysis


Other
#
Upcoming:
* TaskCluster Linux builds are currently built using a mix of m3/r3/c3 
2xlarge AWS instances, depending on pricing and availability. We’re 
going to be looking to assess the effects on build speeds of using more 
powerful AWS instances types, as one potential way of reducing e2e Try 
times.

  * https://bugzilla.mozilla.org/show_bug.cgi?id=1287604


--
Zambrano Gasparnian, Armen
Engineering productivity
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Adding new jobs to pushes is not currently working

2016-07-18 Thread Armen Zambrano G.

Hello,
Treeherder sends a Pulse message to add new jobs to your pushes. Those 
messages are currently not arriving; thus, the system won't work.
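
If you want to check for yourself whether messages are flowing, here is a rough 
sketch of listening on a Pulse exchange with kombu; the exchange name, queue name 
and credentials are assumptions and need to be replaced with real values:

from kombu import Connection, Exchange, Queue

PULSE_URL = "amqps://PULSE_USER:PULSE_PASSWORD@pulse.mozilla.org:5671"        # assumption
exchange = Exchange("exchange/treeherder/v1/job-actions", type="topic")       # assumption
queue = Queue("queue/PULSE_USER/debug", exchange=exchange, routing_key="#")   # assumption

def on_message(body, message):
    print("got message:", body)
    message.ack()

with Connection(PULSE_URL) as connection:
    with connection.Consumer(queue, callbacks=[on_message]):
        while True:
            # Raises a timeout error if nothing arrives within 60 seconds.
            connection.drain_events(timeout=60)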


If you want to know when it gets fixed you can follow this bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1287404

Our apologies for any inconveniences this may cause you.

regards,
Armen

---
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Making try faster for debugging intermittents

2016-07-13 Thread Armen Zambrano G.

On 2016-07-01 06:10 AM, Mike Hommey wrote:

On Fri, Jul 01, 2016 at 03:07:43PM +1000, Xidorn Quan wrote:

On Fri, Jul 1, 2016 at 1:02 PM, Karl Tomlinson <mozn...@karlt.net> wrote:


William Lachance writes:


As part of a larger effort to improve the experience around
debugging intermittents, I've been looking at reducing the time it
takes for common "try" workloads for developers (so that
e.g. retriggering a job to reproduce a failure can happen faster).



Also, accounts of specific try workloads of this type which are
annoying/painful would be helpful. :) I think I have a rough idea
of the particular type of try push I'm looking for (not pushed by
release operations, at least one retrigger) but it would be great
to get firsthand confirmation of that.


One thing that might be helpful is enabling running only tests on
try with a designated build that has already been created.

Often tests are modified to add logging, after which the same
build could be run with the new version of the test, thus saving
waiting for a build.



FWIW, there's a bug about this:
https://bugzilla.mozilla.org/show_bug.cgi?id=1240644


You can already run tests with arbitrary test tarballs (that you could
create locally), but I can't find where it's documented, which may
explain why it's not well known.

(CCing Armen, who would know)

Mike



I totally missed this.

It requires checking out mozci and using the trigger.py script (only 
Buildbot - old CI).

We want to teach |mach try| to help developers with this use case.
It would work with both Buildbot and TaskCluster (new CI).

regards,
Armen
--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Adding jobs to Treeherder will now show a 'Sch' job with the outcome of the request

2016-06-30 Thread Armen Zambrano G.

Hello all,
Until now, adding jobs to Treeherder was completely opaque.
As of today you will see a "Sch" job start running soon after you make 
your request.


You will have access to logs and a link to file a bug. Filing bugs will 
help when my alerting system does not catch issues.


All the information and screenshots are here:
https://armenzg.blogspot.ca/2016/06/adding-new-jobs-to-treeherder-is-now.html

Please let me know if you find any bugs.

Happy hacking,
Armen

--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Tier-1 for Linux 64 Debug builds in TaskCluster on March 14 (and more!)

2016-05-20 Thread Armen Zambrano G.

On 2016-05-19 08:29 PM, Mike Hommey wrote:

On Thu, May 19, 2016 at 07:20:30PM -0500, J. Ryan Stinnett wrote:

On Thu, May 19, 2016 at 6:24 PM, Xidorn Quan  wrote:

2. I cannot retrigger any TC task.

This is pretty annoying when I was debugging intermittent issues. Hopefully
they could get fix before we migrate all Linux builds to TaskCluster,
otherwise we will lose the ability to debug certain kinds of bugs with
Linux builds.


Other people have also mentioned that they can't retrigger these TC
tasks, so this part sounds accurate. Bug 1274176 was filed recently
about this. I hope it will be prioritized, I agree it's frustrating
when working on intermittents.


It's also not possible to *trigger* new TC jobs on treeherder ; like,
pushing with no try syntax and filling what you want with "Add new
jobs". Or using "Add new job" after realizing you forgot a job in your
try syntax.

Mike



martianwars is working on it:
https://bugzilla.mozilla.org/show_bug.cgi?id=1254325

I believe we should have it by the end of June.

--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Two talos jobs for PGO builds on fx-team and m-i

2016-05-12 Thread Armen Zambrano G.
My apologies. I looked at the "default" value rather than the project-specific 
cadence.


https://dxr.mozilla.org/build-central/source/buildbot-configs/mozilla/project_branches.py#20

On 2016-05-12 11:12 AM, Ryan VanderMeulen wrote:

On 5/12/2016 10:51 AM, Armen Zambrano G. wrote:

IIUC, we have PGO builds running on fx-team and m-i every 6 hours.
This only affects Linux x64 and Windows (not Mac).
The test pools are shared between all repositories.

This entails 14 extra jobs for Windows XP and Windows 8.
This entails 16 extra jobs for Windows 7.

This is equivalent to two more try push every 6 hours (fx-team and m-i
are separated by an hour) that only requests talos jobs for Linux x64
and Windows. No build cost.

IMHO the cost seems minimal, however, I will let jmaher weigh in here.

regards,
Armen

On 2016-05-12 10:16 AM, Gregory Szorc wrote:

What impact will this have on machine capacity? The Windows and Mac
testers are already highly overwhelmed. Try jobs are often delayed by
several hours, which I think is a major concern.

(I can't remember if we have a separate pool for Talos testers [on
Try].)


On May 12, 2016, at 05:01, Armen Zambrano G. <arme...@mozilla.com>
wrote:

Hello team,
We're now scheduling two talos jobs for every PGO build on fx-team
[1] and m-i.

This is to help reduce the time it takes to find PGO specific
regressions for these two branches.

If you see any issues falling out of this please file it under
Testing::General and CC me.

This is work done by our contributor martianwars.

regards,
Armen

[1]
https://treeherder.mozilla.org/#/jobs?repo=fx-team=8cf323be5c58b28d8719401ebb0ef63f1d71d000=pgo%20talos_state=expanded



--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform




PGO build frequency is 3 hours.



--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Two talos jobs for PGO builds on fx-team and m-i

2016-05-12 Thread Armen Zambrano G.

IIUC, we have PGO builds running on fx-team and m-i every 6 hours.
This only affects Linux x64 and Windows (not Mac).
The test pools are shared between all repositories.

This entails 14 extra jobs for Windows XP and Windows 8.
This entails 16 extra jobs for Windows 7.

This is equivalent to two more try pushes every 6 hours (fx-team and m-i 
are separated by an hour) that only request talos jobs for Linux x64 
and Windows. No build cost.


IMHO the cost seems minimal, however, I will let jmaher weigh in here.

regards,
Armen

On 2016-05-12 10:16 AM, Gregory Szorc wrote:

What impact will this have on machine capacity? The Windows and Mac testers are 
already highly overwhelmed. Try jobs are often delayed by several hours, which 
I think is a major concern.

(I can't remember if we have a separate pool for Talos testers [on Try].)


On May 12, 2016, at 05:01, Armen Zambrano G. <arme...@mozilla.com> wrote:

Hello team,
We're now scheduling two talos jobs for every PGO build on fx-team [1] and m-i.

This is to help reduce the time it takes to find PGO specific regressions for 
these two branches.

If you see any issues falling out of this please file it under Testing::General 
and CC me.

This is work done by our contributor martianwars.

regards,
Armen

[1] 
https://treeherder.mozilla.org/#/jobs?repo=fx-team=8cf323be5c58b28d8719401ebb0ef63f1d71d000=pgo%20talos_state=expanded

--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform



--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Two talos jobs for PGO builds on fx-team and m-i

2016-05-12 Thread Armen Zambrano G.

Hello team,
We're now scheduling two talos jobs for every PGO build on fx-team [1] 
and m-i.


This is to help reduce the time it takes to find PGO specific 
regressions for these two branches.


If you see any issues falling out of this please file it under 
Testing::General and CC me.


This is work done by our contributor martianwars.

regards,
Armen

[1] 
https://treeherder.mozilla.org/#/jobs?repo=fx-team=8cf323be5c58b28d8719401ebb0ef63f1d71d000=pgo%20talos_state=expanded


--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: PSA: Cancel your old Try pushes

2016-04-28 Thread Armen Zambrano G.

On 2016-04-26 05:54 PM, Mike Hommey wrote:

On Tue, Apr 26, 2016 at 03:49:11PM +0200, Gabor Krizsanits wrote:

As someone who was high on the list of try server usage for two
weeks  My problem was a test I tried to fix for both e10s and
non-e10s, and it timed out _sometimes_ on _some_ platforms even
depending on debug/release build. It was a whack-a-mole game by
fiddling with the test and a complex patch. I did stop old builds but
I did not run only the test in question but the rest of them as well
because of the invasive nature of the patch the whole thing was
sitting on. Probably I could have been smarter, BUT...

What would have helped me a lot in this case and most cases when I
rely on the try server is the ability to push a new changeset on top
of my previous one, and tell the server to use the previous session
instead of a full rebuild (if there is only a change in the tests
that's even better, no rebuild at all) and then tell the server
exactly which tests I want to re-run with those changes (as it can be
an empty set this can be used to trigger additional tests for a
previous push). This could all be done by an extensions to the try
syntax like -continue [hash]. As an addition this follow up push would
also kill the previous job.

Maybe there is already such functionality available, just I'm not
aware of it (I would be so happy if this were the case, and would feel
bad for the machine hours I wasted...), if so please let me know.


You can do that more or less with moz-ci. IIRC, the setup is detailed
somewhere on Armen's blog (CCed, he might be able to point you there)

Mike



It is possible for Buildbot jobs (not TaskCluster) with a Python script; 
you need to specify where to find the builds and test bundles [1].


However, this is not optimal as it stands.
You need to upload the new test bundle somewhere and point to that.

I've filed this as https://bugzilla.mozilla.org/show_bug.cgi?id=1268481
and I'm tracking it under the "making try awesome" meta bug.


[1] 
https://github.com/mozilla/mozilla_ci_tools/blob/master/scripts/trigger.py#L87


--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Adding new jobs to treeherder does not work for Linux64 debug jobs

2016-04-22 Thread Armen Zambrano G.

Hello team,
Recently we hid the Linux64 debug jobs that run on Buildbot since we 
have the TaskCluster version covering that.


Unfortunately, you won't be able to add Linux64 debug jobs *even* if 
Treeherder shows them to you (they will be scheduled but hidden).


We will have someone working on this in May and I will also have more 
cycles to fix this and other issues (e.g. lack of feedback on how the 
request went).


My apologies that we've not had time to work on this until now.

regards,
Armen

--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Build System Project - Update from the last 2 weeks

2016-04-20 Thread Armen Zambrano G.

Do we know what happened with the data points at the end of the graph?

https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000=%5Bmozilla-inbound,04b9f1fd5577b40a555696555084e68a4ed2c28f,1%5D=%5Bmozilla-inbound,65e0ddb3dc085864cbee77ab034dead6323a1ce6,1%5D=%5Bmozilla-inbound,c0018285639940579da345da71bb7131d372c41e,1%5D=1460493349005.865,1461176501000,0,1

If I read it correctly, it looks like we're getting fewer PGO builds this week 
(2/day versus 9-10/day).


This link without opt makes it more obvious:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,c0018285639940579da345da71bb7131d372c41e,1%5D=%5Bmozilla-inbound,04b9f1fd5577b40a555696555084e68a4ed2c28f,0%5D=%5Bmozilla-inbound,65e0ddb3dc085864cbee77ab034dead6323a1ce6,1%5D

regards,
Armen


On 16-04-20 11:44 AM, Gregory Szorc wrote:




On Apr 20, 2016, at 08:16, Nicolas B. Pierron  
wrote:

Unrelated, Do we have news for clang blockers on Windows?  In particular, I am 
thinking about the various Sanitizers.


We haven't really talked about Clang on Windows in our build meeting/plannings. 
That's not to say someone else hasn't been working on it. But I haven't seen 
much bug traffic indicating that's the case.

If you make a case for Clang on Windows improving developer productivity or 
improving stability, that's how you get something prioritized. Can you start a 
thread on dev-builds listing the benefits?




On 04/20/2016 11:00 AM, David Burns wrote:
We have also started looking at how we can use a global compiler cache on
local builds and not just in automation. This will allow artifact-like
builds for those who are doing C++ development.


I am particularly scared about this topic for multiple reasons, including that 
my system does not have a /lib directory.  Is there a location where I can 
learn more and contribute back to this?

--
Nicolas B. Pierron
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform



--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Moving XP to ESR?

2016-04-20 Thread Armen Zambrano G.

Would it make more sense to have a relbranch instead of using ESR?
IIRC ESRs are stable for a period but when we uplift we uplift 
everything new.

For this XP relbranch we would only take security patches.

It would serve the purpose of keeping our users secure where they are, while 
saving some of the effort of making new features XP-compatible.


Setting an N-month EOL expectation (plus other criteria) might be 
wise rather than leaving it open ended.


regards,
Armen

On 16-04-20 11:46 AM, Henri Sivonen wrote:

On Mon, Apr 18, 2016 at 4:46 PM, Thomas Zimmermann
 wrote:

And XP still runs on ~10% of all desktops. That's an opportunity to
convert some of the users to Firefox.


This assumes that

1) users who are still on XP still make active browser choices
2) ESR wouldn't be good enough to for these users
3) XP will still run ~10% of desktops in 11 months.

(FWIW, StatCounter puts XP's Web usage share of desktop closer to 7% than 10%.)

On Mon, Apr 18, 2016 at 7:56 PM, Milan Sreckovic  wrote:

What’s the “XP tax”?


It's our attention being diverted to being backward-looking instead of
forward-looking by a thousand cuts. Here are some examples off the top
of my head:

  * We don't have EME-style DRM on XP, but if we hadn't even tried to
accommodate XP, we could have avoided some grief. (For obvious
reasons, I'm not going to elaborate on this on this list.)

  * The Rust team has had to do extra work to support XP, since XP is a
Firefox product requirement.

  * Lack of SSE2, though not an XP problem per se, coincides with XP,
so we could just require SSE2 if we didn't support XP.

  * XP failing to preserve register state on newer CPUs caused an
investigation like
https://bugzilla.mozilla.org/show_bug.cgi?id=1263495#c13

Obviously, none of the above alone seems decisive, but those are just
a few recent things that I can think of without searching. I'm sure
there are a lots and lots of things, each smallish taken alone, but
they add up and take our collective attention away from making our
product better on current systems. Moving XP to ESR would liberate us
from thinking of some of them, but, granted, we might feel compelled
to figure out stuff like the AVX thing even on ESR. Also, some of the
above are sunk cost now, but my point is that as long as XP is treated
as supported, it can inflict new analogous costs on us.

On Tue, Apr 19, 2016 at 11:24 PM, Kyle Huey  wrote:

We jump through some hoops to support things like Linux and Mac too, and
those platforms combined have far fewer users than XP.


Linux and Mac will still be as relevant in 11 months. XP's relevance
is declining. If our estimate was that XP won't be worthwhile in 11
months, putting it on ESR now would make sense compared to expending
the effort of full support over the next 11 months.




--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Try syntax is no longer required when pushing to try

2016-03-04 Thread Armen Zambrano G.
On 16-03-03 08:35 PM, Xidorn Quan wrote:
> On Fri, Mar 4, 2016 at 1:25 AM, Andrew Halberstadt <ahalberst...@mozilla.com
>> wrote:
> 
>> With treeherder's "Add New Jobs" [1] UI, using try syntax is no longer
>> the only way to schedule stuff on try. As of now, it's possible to push
>> to try without any try syntax. If you do this, no jobs will be scheduled
>> on your push and it will be up to you to add whichever jobs you want via
>> "Add New Jobs". Please note, it's not currently possible to schedule
>> taskcluster jobs this way. Taskcluster support will be tracked by bug
>> 1246167 [2].
>>
> 
> Great job!
> 
> I hope it could support using shift key to select a range of jobs, just
> like what I did for trychooser in bug 1217283
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1217283>. It would make
> things much convenient.
> 
I will file this and get some contributors to look into it.

> Also I'd suggest it support adding jobs via parsing try syntax, which
> should make triggering a same set of jobs among multiple platforms easier.
> 
Perhaps a prompt that allows you to include and exclude patterns and shows
live what gets added might be more intuitive (rough sketch below).
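
Purely as an illustration of the idea (not Treeherder code), filtering the full
list of job names with include/exclude patterns could look something like this;
the job names are made up:

from fnmatch import fnmatch

def select_jobs(job_names, include="*", exclude=None):
    # Keep names matching the include pattern, then drop excluded ones.
    chosen = [name for name in job_names if fnmatch(name, include)]
    if exclude:
        chosen = [name for name in chosen if not fnmatch(name, exclude)]
    return chosen

jobs = [
    "Linux x64 opt mochitest-1",
    "Linux x64 debug mochitest-1",
    "Windows 7 opt reftest-3",
    "Linux x64 opt talos-other",
]
print(select_jobs(jobs, include="Linux x64*", exclude="*debug*"))
# -> ['Linux x64 opt mochitest-1', 'Linux x64 opt talos-other']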

I will file a bug and see who can pick it up.

Thanks for the suggestions!

> - Xidorn
> 

Armen

-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Try syntax is no longer required when pushing to try

2016-03-04 Thread Armen Zambrano G.
On 16-03-04 01:22 AM, Mike Hommey wrote:
> On Thu, Mar 03, 2016 at 12:25:53PM -0500, Andrew Halberstadt wrote:
>> With treeherder's "Add New Jobs" [1] UI, using try syntax is no longer
>> the only way to schedule stuff on try. As of now, it's possible to push
>> to try without any try syntax. If you do this, no jobs will be scheduled
>> on your push and it will be up to you to add whichever jobs you want via
>> "Add New Jobs". Please note, it's not currently possible to schedule
>> taskcluster jobs this way. Taskcluster support will be tracked by bug
>> 1246167 [2].
> 
> Haven't tested whether it works, but it appears to also be possible to
> trigger PGO jobs with "Add New Jobs", which the try syntax didn't allow
> to do (it required going through hoops with a mozconfig)
> 
> Mike
> 

The way Buildbot configs work, builders may be added even though there is
no condition that triggers them. In the case of PGO, they're time-based and
thus not triggered through the try syntax.

This is a very good finding! Please let us know what you discover.
This might become the new path to scheduling them.

If it works, I can start watching for pgo in the try syntax.

regards,
Armen

-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-12 Thread Armen Zambrano G.
On 16-02-10 04:58 AM, Marco Bonardo wrote:
> On Wed, Feb 10, 2016 at 10:30 AM, James Graham 
> wrote:
> 
>> FWIW I think it's closer to the truth to say that these tests are not set
>> up to be performance regression tests
>>
> 
> Right, but this was just one of the aspects that was pointed out. I think
> the performance analysis is not much about catching regressions (even if
> that can effectively happen), rather figuring why a test takes 90 seconds
> instead of 10, is that a problem that may affect end users too?
> The larger we set the threshold, the less people are likely to investigate
> the reasons the test takes so much, cause there will be no bugs filed about
> the problem. It will just go unnoticed.
> Other reasons are bound to costs imo. Developers time is a cost, if
> everyone starts writing tests taking more time, cause now the timeout is
> much larger, we may double the time to run tests locally or to get results
> out of Try.
> Finally, bumping timeouts isn't a final solution, that means in a few years
> we may be back rediscussing another bump.
> 
> I'm not saying bumping the timeout is wrong, I'm saying we should also
> evaluate a long term strategy to avoid the downsides. Maybe we should have
> something like the orange factor tracking tests runtime (in all the
> harnesses) and automatically filing bugs in the appropriate component when
> some test goes over a threshold?
> 

In this case it would be better to have a tool that keeps track of test
runtimes and reports regressions.
ActiveData has the data to do this.
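
As a rough sketch of what such a tool could do, the query below pulls per-test
runtimes out of ActiveData over HTTP; the endpoint and the field names are
assumptions about the ActiveData schema rather than a documented contract:

import requests

QUERY = {
    "from": "unittest",                                              # assumption: per-test result table
    "select": {"value": "result.duration", "aggregate": "median"},   # assumption
    "groupby": ["result.test"],
    "where": {"eq": {"build.branch": "mozilla-inbound"}},            # assumption
    "limit": 1000,
}

# assumption: public ActiveData query endpoint
response = requests.post("https://activedata.allizom.org/query", json=QUERY)
response.raise_for_status()
print(response.json())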


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-12 Thread Armen Zambrano G.
On 16-02-09 01:32 PM, Daniel Holbert wrote:
> Just to clarify, you're *only* talking about browser-chrome mochitests
> here, correct?  (not other mochitest suites like mochitest-plain)
> 
> (It looks like this is the case, based on the bug, but your dev.platform
> post here made it sound like this change affected all mochitests.)
> 
> Thanks,
> ~Daniel
I'm interested in browser-chrome. Is there a different variable for
those tests?

> 
> On 02/08/2016 02:51 PM, Armen Zambrano G. wrote:
>> Hello,
>> In order to help us have less timeouts when running mochitests under
>> docker, we've decided to double mochitests' gTimeoutSeconds and reduce
>> large multipliers in half.
>>
>> Here's the patch if you're curious:
>> https://bugzilla.mozilla.org/page.cgi?id=splinter.html=1246152=8717111
>>
>> If you have any comments or concerns please raise them in the bug.
>>
>> regards,
>> Armen
>>


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Armen Zambrano G.
I will try 60 seconds and see how it goes.

On 16-02-09 05:47 AM, Marco Bonardo wrote:
> 90 seconds for a simple test sounds like a lot of time and a huge bump from
> the current situation (45).
> The risk is people will start writing much bigger tests instead of
> splitting them into smaller an more manageable tests. Plus when a test
> depends on a long timeout in the product, developers are used to figure out
> ways to reduce those (through hidden prefs or such) so that test can finish
> sooner and not timeout.
> Based on that, bumping the timeout may have 2 downsides, long term:
> - slower tests for everyone
> - sooner or later 90 seconds won't be enough again. Are we going to bump to
> 180 then?
> 
> I think that's the main reason the default timeout was set to a low value,
> while still allowing the multipliers as a special case for tests that
> really require bigger times, cause there's no other way out.
> 
> Is docker doubling the time for every test? From the bug looks like it may
> add 20-30% of overhead, so why are we not bumping the timeout of 30% (let's
> say 60s) and investigating the original cause (the bug that takes 80s to
> run) to figure if something can be done to make it finish sooner?
> 
> -m
> 
> 
> On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. <arme...@mozilla.com>
> wrote:
> 
>> Hello,
>> In order to help us have less timeouts when running mochitests under
>> docker, we've decided to double mochitests' gTimeoutSeconds and reduce
>> large multipliers in half.
>>
>> Here's the patch if you're curious:
>>
>> https://bugzilla.mozilla.org/page.cgi?id=splinter.html=1246152=8717111
>>
>> If you have any comments or concerns please raise them in the bug.
>>
>> regards,
>> Armen
>>
>> --
>> Zambrano Gasparnian, Armen
>> Automation & Tools Engineer
>> http://armenzg.blogspot.ca
>> ___
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform
>>


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-08 Thread Armen Zambrano G.
Hello,
In order to help us have fewer timeouts when running mochitests under
docker, we've decided to double mochitests' gTimeoutSeconds and cut
large multipliers in half.

Here's the patch if you're curious:
https://bugzilla.mozilla.org/page.cgi?id=splinter.html=1246152=8717111

If you have any comments or concerns please raise them in the bug.

regards,
Armen

-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Use --interactive in try syntax to SSH into docker image

2016-02-02 Thread Armen Zambrano G.
Hello,
This was demonstrated at Mozlando by gardnt and jonasfj.

To use it:
* add --interactive to your try syntax
* wait for the task to start running
* click on "Inspect task" hyperlink
 * bottom left of page when you click a job on Treeherder
* click on 'private/docker-worker/shell.html' under artifacts
* wait a bit until you see a prompt

Note that the actual job won't be executing. It simply helps you inspect
what is installed in the docker image.

At the moment, I've only tried this with a build docker image.

regards,
Armen

---
INTERACTIVE
Allows ssh-like access to running containers. Will extend the lifetime
of a task to allow a user to SSH in before the container dies, so be
careful when using this feature. Will also keep the task alive while is
connected and a little bit after that so a user can keep working in ssh
after the task ends.
---

The slides (--interactive is mentioned 3/4 through the deck) :
http://docs.taskcluster.net/presentations/lightning-talk/#/interactive

-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Use --interactive in try syntax to SSH into docker image

2016-02-02 Thread Armen Zambrano G.
Three more things:
* This works for test jobs as well
* You can edit a task and resubmit it by setting
task.payload.features.interactive to true (see the sketch below)
* You can also click on "private/docker-worker/display.html" and get VNC
access over the browser.
 * You can actually see the browser running the tests!
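
For the second point, the relevant fragment of the edited task definition looks
roughly like this (shown as a Python dict; every key other than
features.interactive is a placeholder for whatever the original task already
contained):

task_definition = {
    "payload": {
        "image": "...",        # keep the task's original value
        "command": ["..."],    # keep the task's original value
        "features": {
            "interactive": True,   # the bit to add before resubmitting
        },
    },
}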

Known bug: If you use --interactive your push will show up as green (bug
1232385)


On 16-02-02 02:27 PM, Armen Zambrano G. wrote:
> Hello,
> This was demonstrated at Mozlando by gardnt and jonasfj.
> 
> To use it:
> * add --interactive to your try syntax
> * wait for the task to start running
> * click on "Inspect task" hyperlink
>  * bottom left of page when you click a job on Treeherder
> * click on 'private/docker-worker/shell.html' under artifacts
> * wait a bit until you see a prompt
> 
> Notice, the actual job won't be executing. It simply helps you inspect
> what is installed in the docker image.
> 
> At the moment, I've only tried this with a build docker image.
> 
> regards,
> Armen
> 
> ---
> INTERACTIVE
> Allows ssh-like access to running containers. Will extend the lifetime
> of a task to allow a user to SSH in before the container dies, so be
> careful when using this feature. Will also keep the task alive while is
> connected and a little bit after that so a user can keep working in ssh
> after the task ends.
> ---
> 
> The slides (--interactive is mentioned 3/4 through the deck) :
> http://docs.taskcluster.net/presentations/lightning-talk/#/interactive
> 


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Shutting try extender off (please add jobs through Treeherder)

2016-01-28 Thread Armen Zambrano G.
I'm going to be shutting off try-extender [1] on February 5th since you
can accomplish the same via Treeherder ('add new jobs' button).

I've fixed all known bugs and I have bandwidth to help in case new
issues arise.

In the future, more features will come to give you proper feedback when
things go wrong (instead of me being alerted).

regards,
Armen

[1] http://try-extender.herokuapp.com/
-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Testing documentation (was Re: Where to put docker documentation?)

2016-01-25 Thread Armen Zambrano G.
Sure.

There's no clear structure for where something like this should live. I
thought a good starting point would be this:
https://developer.mozilla.org/en-US/docs/Mozilla/Testing

However, that doesn't seem to be a page that would take you to other
testing-related matters.

Then there's this tag:
https://developer.mozilla.org/en-US/docs/tag/Automated%20testing

Then there's this page:
https://developer.mozilla.org/en-US/docs/Running_automated_tests

I wonder if any of you would be interested in fixing this a bit, or at
least in brainstorming a new structure proposal with me.

regards,
Armen

On 16-01-22 02:23 PM, Mike Hoye wrote:
> On 2016-01-22 12:07 PM, Armen Zambrano G. wrote:
>> We're now running side by side Linux x64 debug test jobs inside of
>> docker on TaskCluster [1]
>>
>> Where could I add instructions on how to run jobs running inside of
>> docker? (mdn? the tree? else?)
> MDN please. Docker is going to going to be important for community
> participation in infrastructure, so MDN docs would be very helpful.
> 
> - mhoye


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Where to put docker documentation?

2016-01-22 Thread Armen Zambrano G.
We're now running side by side Linux x64 debug test jobs inside of
docker on TaskCluster [1]

Where could I add instructions on how to run jobs running inside of
docker? (mdn? the tree? else?)

regards,
Armen

[1]
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central=tc%20linux%20x64%20debug_state=expanded


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Just Autoland It

2016-01-22 Thread Armen Zambrano G.
On 16-01-22 08:16 AM, Jared Wein wrote:
> If a patch only touches JS and CSS, could tryserver use the archive build
> implementation to generate faster try builds?
> 

We could do things like that (We can tell test jobs to grab the
installer and test packages from different locations).

regards,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Adding new jobs to pushes in better shape

2016-01-21 Thread Armen Zambrano G.
I've spent the last two days fixing most issues affecting adding jobs on
Treeherder.
I believe it is now working properly.
You don't need to try multiple times.

Remember:
* this does not work with TaskCluster jobs [1]
* if you own the push, you can add jobs
* if you use an @mozilla.com address, you have full privileges
* if you need to be whitelisted because you have a non @mozilla.com
address let me know

More improvements are needed [2] to give you proper feedback and make
this a production-worthy system; however, it should work for now.
If it doesn't, please let me know (file bugs under Testing:General and
CC me).

regards,
Armen

[1]
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central=tc
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=push_extender

-- 
Zambrano Gasparnian, Armen
Engineering productivity engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: FYI: e10s will be enabled in beta 44/45

2015-12-04 Thread Armen Zambrano G.
LastPass brings the browser to a crawl, making it almost impossible to
use. If we have LastPass users in the beta population with e10s enabled,
we're going to have a lot of upset people.
https://bugzilla.mozilla.org/show_bug.cgi?id=1008768


On 15-12-04 10:44 AM, Ehsan Akhgari wrote:
> On 2015-12-04 9:02 AM, jmath...@mozilla.com wrote:
>> Hey all,
>>
>> FYI e10s will be enabled in beta/44 in a limited way for a short
>> period time through an experiment. [1] The purpose of this experiment
>> is to collect e10s related performance measurements specific to beta.
>>
>> The current plan is to then enabled e10s in beta/45 for a respectable
>> chunk of our beta population. This population will *exclude* users who
>> have recently loaded accessibility and users who have a large number
>> of addons installed.
>>
>> If you know of serious e10s related bugs in your components that you
>> feel should be fixed for the beta/45 rollout please get those bugs
>> tracked for 45 and owned. If you have any issues or questions you can
>> tag the bug with tracking-e10s:? and the e10s team will come through
>> to help triage.
> 
> Does this mean that not all tracking-e10s+ bugs will need to be fixed
> before we ship e10s?  What's the indicator that a bug "blocks" shipping
> e10s?
> 


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


If you run mozharness on your local machine (tooltool changes)

2015-05-13 Thread Armen Zambrano G.
Hello all,
At the end of last year we made some changes to make mozharness easier
to run outside of the Release Engineering VPN.
Mozharness is the python harness that runs most jobs that show up on
Treeherder.

Tooltool has been re-written recently [1]; you will now need a token on
your machine (documented) [2].
Tooltool allows fetching binaries (public and private) onto a machine;
it does verification and caching if needed.
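
As an illustration only (the manifest name and token path below are
hypothetical; see the docs [2] for the real details), fetching a
manifest with the new client looks roughly like this:

  # hypothetical manifest/token names -- take the real ones from the docs
  $ python tooltool.py fetch -m releng.manifest --authentication-file ~/.tooltool-token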

In general, if you try to run mozharness as a developer to troubleshoot
an issue and you start having trouble, file a bug (see the docs' header)
with the details; if we can't help you on #ateam/#releng, look into
requesting a loaner so you are not blocked (I added a note about this in
the docs).

Some jobs are very dependent on the CI machines they run on and cannot
easily be run outside of the intended setup. Where possible we would
like to fix this.

regards,
Armen

[1] http://code.v.igoro.us/posts/2015/04/new-tooltool-in-production.html
[2]
https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer#Step_2_-_Create_a_tooltool_token_.28fetches_artifacts_for_you.29

-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: (mozci) Triggering jobs on treeherder

2015-04-27 Thread Armen Zambrano G.

Thank you all for reading through the post and for the feedback.
It seems that there is interest in this.

On 15-04-24 06:06 PM, Mike Hommey wrote:

On Fri, Apr 24, 2015 at 10:26:58PM +0100, Gijs Kruitbosch wrote:

Are you going to build a web UI for this so I don't need to check out a repo
and run a python script with syntax that I'll likely need to look up every
time I want to do it, guessing builder names that I don't know?

(don't get me wrong, I could probably use it if I needed to, but it's harder
than it could be right now...)

If we allowed invoking it through mach, it would also help by not requiring you
to check out an extra repo or pip install a package on your system.


Yes, we could build a web UI to help with this process, however, at the 
same time I wanted to measure how many people would likely use it.



It seems to me like self-serve does a lot of this, but you can't retrigger
things that haven't run in the first place.




mozci allows you to trigger jobs that have not been scheduled.
It uses an API that self-serve/buildapi provides.
I know that the self-serve UI does not expose this specific API, only
other ones.



Why not have these features on treeherder?


Yes, we could; it is in the milestones.
We need to be picky as to what to move there.

I will announce it once we have a prototype working.


Mike




--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


(mozci) Triggering jobs on treeherder

2015-04-24 Thread Armen Zambrano G.

Hello all,
Last quarter we wrote a project called Mozilla CI Tools (mozci), which 
allows triggering jobs on Treeherder. [1]


This is especially useful for back-filling jobs (especially when 
coalesced) and for bisecting via job triggering.


Specifically, I want to bring to your attention a use case that helps 
developers who pushed to try with the wrong try syntax [2]. If you read 
the blog post or the use cases and think of other useful ones, please 
let us know.
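
As a purely hypothetical sketch (the real script name and flags are in
the use cases doc [2]), back-filling a coalesced job could look
something like:

  # hypothetical invocation -- check the docs for the actual script and options
  $ python mozci/scripts/trigger.py --buildername "Ubuntu VM 12.04 mozilla-inbound opt test mochitest-1" --rev <changeset> --times 1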


We want to bring to mach only the use cases which make sense to you.

Kudos to @vaibhav, @adusca, @jmaher et al for their large record of 
contributions [3].


regards,
Armen

PS=Support for B2G will be added this quarter

[1] 
http://armenzg.blogspot.ca/2015/04/what-mozilla-ci-tools-is-and-what-it.html
[2] 
http://mozilla-ci-tools.readthedocs.org/en/master/use_cases.html#case-scenario-8-developer-needs-to-add-missing-platforms-jobs-for-a-try-push

[3] https://github.com/armenzg/mozilla_ci_tools/graphs/contributors
--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: e10s is now enabled by default for Nightly!

2014-11-07 Thread Armen Zambrano
If we have enabled e10s for nightly, are the normal non-e10s test jobs 
going to run with the pref off?
If not, we would be testing e10s twice on fx36, stop testing non-e10s 
fx36, and let non-e10s regressions sneak into aurora when we uplift 
(since I believe we're locking to trunk).


regards,
Armen

On 14-11-06 07:27 PM, Chris Peterson wrote:

The patch is on mozilla-inbound and ought to hit mozilla-central in time
for tomorrow's Nightly build. \o/

https://hg.mozilla.org/integration/mozilla-inbound/rev/a75897e664dd

e10s will not ride the trains to Aurora 36. Talos and unit tests will
continue to run for e10s and non-e10s until e10s hits the Release channel.

Some known problems:

* IME and a11y will disable e10s until support is completed
* Some performance problems with Adblock Plus
* Bug  947030 - Ghostery add-on does not block trackers
* Bug  972507 - BugzillaJS add-on does not work [1]
* Bug 1008768 - LastPass add-on does not fill in form fields
* Bug 1014986 - HTTPS Everywhere add-on breaks HTTP redirects
* Bug 1042680 - Tree Style Tabs add-on does not work
* Bug 1042195 - 1Password add-on does not work
* Bug 1058542 - NoScript add-on does not work
* Bug 1093161 - Searching from address bar does not work the first time

If you have any questions, drop by #e10s on IRC. If you file new bugs
related to e10s, please include the word e10s in the summary so the
e10s team's triage queries will find your bug.


chris

[1] btw, BugzillaJS is seeking a new maintainer:
https://www.yammer.com/mozillians/#/threads/show?threadId=454089406


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Win64 builds tests coming soon!

2014-10-24 Thread Armen Zambrano G.
As of today we also have:
* Win64 opt talos
* Win64 pgo builds, tests and talos

Both of them running on Windows 8 64-bit test machines.

On graphs.m.o you should recognize the platform as WINNT 6.2 x64.

cheers,
Armen

On 14-10-21 04:00 PM, Chris AtLee wrote:
 Hi,
 
 Just a quick note that we're hoping to enable 64-bit windows builds 
 tests across most trunk branches this week. This includes branches such
 as mozilla-central, mozilla-inbound, fx-team, etc.
 
 In order to get adequate test coverage without at the same time
 overwhelming our windows test infrastructure, we've decided to disable
 32-bit windows 8 testing on these branches and enable 64-bit windows 8
 testing instead.
 
 Work is being tracked in bug 1080134 [1]. Please follow up there, or
 find me in #releng.
 
 Cheers,
 Chris
 
 [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1080134

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: How to run browser rooting analysis of mozharness locally?

2014-09-15 Thread Armen Zambrano G.
Filed to make it clearer next time:
https://bugzilla.mozilla.org/show_bug.cgi?id=1067354

Thanks for trying it!

On 14-09-15 12:11 AM, Ting-Yu Chou wrote:
 Hi,
 
 The patch of bug 1049290 failed linux64-br-haz_try_dep, I am trying to debug 
 it
 locally. I read:
 
   
 https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer,
 
 and use the command (I didn't see installer/test url in the log):
 
   $ python scripts/scripts/spidermonkey_build.py --config-file 
 hazards/build_browser.py --config-file hazards/common.py --cfg 
 developer_config.py
 
 but it complains:
 
   11:37:00 INFO - #
   11:37:00 INFO - # Running checkout-source step.
   11:37:00 INFO - #
   11:37:00 INFO - Running main action method: checkout_source
   11:37:00FATAL - Uncaught exception: Traceback (most recent call last):
   11:37:00FATAL -   File 
 /home/ting/w/fx/bz/1049290/scripts/mozharness/base/script.py, line 1258, in 
 run
   11:37:00FATAL - self.run_action(action)
   11:37:00FATAL -   File 
 /home/ting/w/fx/bz/1049290/scripts/mozharness/base/script.py, line 1200, in 
 run_action
   11:37:00FATAL - self._possibly_run_method(method_name, 
 error_if_missing=True)
   11:37:00FATAL -   File 
 /home/ting/w/fx/bz/1049290/scripts/mozharness/base/script.py, line 1141, in 
 _possibly_run_method
   11:37:00FATAL - return getattr(self, method_name)()
   11:37:00FATAL -   File scripts/scripts/spidermonkey_build.py, line 
 33, in wrapper
   11:37:00FATAL - assert (val is not None and None not in 
 str(val)), invalid  + query.__name__
   11:37:00FATAL - AssertionError: invalid query_repo
   11:37:00FATAL - Running post_fatal callback...
   11:37:00FATAL - Exiting -1
 
 I have tried to set environment PROPERTIES_FILE with a file which has
 buildprops.json content from log, but still doesn't work.
 
 Anyone know the correct command to run the test locally?
 
 Thanks,
 Ting
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running mozharness locally and be able to reach private files

2014-09-15 Thread Armen Zambrano G.
For now, this is limited to test jobs. I should have emphasized that.

My apologies. I was hoping to deal with the different use cases as
people tried them out.

Filed:
https://bugzilla.mozilla.org/show_bug.cgi?id=1067354

On 14-09-11 08:58 AM, Armen Zambrano G. wrote:
 Hello all,
 It is now less hard to run mozharness locally by appending --cfg
 developer_config.py to production commands.
 
 Appending the config will activate a developer mode which does the
 following:
 * Remove hard coded paths for binaries
 * Substitute internal URLs to point to externally reachable URLs
 * Add Http authentication to reach private files
 * It also enforces the use of --test-url and --installer-url
 
 You can read more about it in here:
 http://armenzg.blogspot.ca/2014/09/run-tbpl-jobs-locally-with-http.html
 
 This allowed me to not have to request a Releng loaner to reproduce a
 job run on tbpl.
 
 I know that this is not the holy grail but it helps to ease the pain.
 
 
 What would people want to see in the long term to make mozharness easier
 for you?
 I've considered writing a mach command to which we can pass a URL to a
 log and does all the thinking for us.
 
 Another project that I believe would be useful is to allow a developer
 to point mozharness at local binaries. I know it is possible but I have
 not documented it since I have not tested it.
 
 There's also the project of making mach deal with mozharness the same
 way that we have mach support for talos (which is using mozharness in
 the back).
 
 regards,
 Armen
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Running mozharness locally and be able to reach private files

2014-09-11 Thread Armen Zambrano G.
Hello all,
It is now less hard to run mozharness locally by appending --cfg
developer_config.py to production commands.

Appending the config will activate a developer mode which does the
following:
* Remove hard coded paths for binaries
* Substitute internal URLs to point to externally reachable URLs
* Add Http authentication to reach private files
* It also enforces the use of --test-url and --installer-url

You can read more about it in here:
http://armenzg.blogspot.ca/2014/09/run-tbpl-jobs-locally-with-http.html
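
As a rough, hypothetical example (the script, configs and URLs differ
per job; take the exact command from the log of the job you want to
reproduce), a desktop mochitest run could look like:

  # illustrative config/suite names -- copy the real command from the job log
  $ python scripts/desktop_unittest.py --cfg unittests/linux_unittest.py \
      --mochitest-suite plain --cfg developer_config.py \
      --installer-url <url to the build> --test-url <url to tests.zip>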

This allowed me to not have to request a Releng loaner to reproduce a
job run on tbpl.

I know that this is not the holy grail but it helps to ease the pain.


What would people want to see in the long term to make mozharness easier
for you?
I've considered writing a mach command to which we can pass the URL of
a log and which does all the thinking for us.

Another project that I believe would be useful is to allow a developer
to point mozharness at local binaries. I know it is possible but I have
not documented it since I have not tested it.

There's also the project of making mach deal with mozharness the same
way that we have mach support for talos (which uses mozharness behind
the scenes).

regards,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running mozharness locally and be able to reach private files

2014-09-11 Thread Armen Zambrano G.
On 14-09-11 09:03 AM, Joshua Cranmer  wrote:
 On 9/11/2014 7:58 AM, Armen Zambrano G. wrote:
 What would people want to see in the long term to make mozharness easier
 for you?
 
 A Dockerfile (or a container image) that produces a Ubuntu64 test slave.
 
Hi Joshua, that would be ideal; however, it does not help with the
remaining platforms (e.g. Windows).

If releng went down that route, we would need to use that approach on
the actual production machines to ensure that the produced Dockerfiles
(container images) would always be up to date.

If I understand correctly, your suggestion is mainly about providing a
way to reproduce the actual production environment rather than
specifically about making mozharness easier (even though production
scripts would be able to work on an external machine).

PS=I've not used docker myself

cheers,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


MozHarness outside of loaners and releng network

2014-07-14 Thread Armen Zambrano G.
Hello,
There are some recent improvements to mozharness that help with running
the jobs outside of tbpl and releng loaner machines:
* HTTP authentication
** e.g. HTTP authentication for pvtbuilds
* URL substitutions
** Hit externally reachable hosts instead of internal ones
* Developer configs
** Allow overriding production variables

The wiki has been over-hauled with this info. [1]
The improvements are also highlighted in here. [2][3]

Best regards,
Armen & Aki

[1]
https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer
[2]
http://armenzg.blogspot.ca/2014/07/introducing-http-authentication-for.html
[3]
http://armenzg.blogspot.ca/2014/07/using-developer-configs-for-mozharness.html
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Trigger arbitrary jobs on your Try push (or other pushes)

2014-06-11 Thread Armen Zambrano G.
Hello,
A while ago, zeller, catlee and hwine worked on getting a new BuildAPI
available [1]. This allows you to trigger an arbitrary job on the
releng/tbpl system after a push.

I've put together a script that helps with hitting that API [2][3].

The API allows you to use a hand-zipped tests.zip, which is neat if you
want to avoid waiting on builds to create it.

regards,
Armen

[1]
http://johnzeller.com/blog/2014/03/12/triggering-of-arbitrary-buildstests-is-now-possible/
[2]
http://armenzg.blogspot.ca/2014/06/who-doesnt-like-cheating-on-try-server.html
[3]
https://wiki.mozilla.org/ReleaseEngineering/How_To/Trigger_arbitrary_jobs
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: OMTC on Windows

2014-05-18 Thread Armen Zambrano G.
What kind of bugs could we expect to see?
Is there any place you would like us to focus our testing on?

Thanks for all the hard work to get this in.

cheers,
Armen

On 2014-05-18, 3:16 AM, Bas Schouten wrote:
 Hey all,
 
 After quite a lot of waiting we've switched on OMTC on Windows by default 
 today (bug 899785). This is a great move towards moving all our platforms 
 onto OMTC (only linux is left now), and will allow us to remove a lot of code 
 that we've currently been duplicating. Furthermore it puts us on track for 
 enabling other features on desktop like APZ, off main thread animations and 
 other improvements.
 
 Having said that we realize that what we've currently landed and turned on is 
 not completely bug free. There's several bugs still open (some more serious 
 than others) which we will be addressing in the coming weeks, hopefully 
 before the merge to Aurora. The main reason we've switched it on now is that 
 we want to get as much data as possible from the nightly channel and our 
 nightly user base before the aurora merge, as well as wanting to prevent any 
 new regressions from creeping in while we fix the remaining problems. This 
 was extensively discussed both internally in the graphics team and externally 
 with other people and we believe we're at a point now where things are 
 sufficiently stabilized for our nightly audience. OMTC is enabled and 
 disabled with a single pref so if unforeseen, serious consequences occur we 
 can disable it quickly at any stage. We will inevitably find new bugs in the 
 coming weeks, please link any bugs you happen to come across to bug 899785, 
 if anything 
 seems ver
y serious, please let us know, we'll attempt to come up with a solution on the 
short-term rather than disabling OMTC and reducing the amount of feedback we 
get.
 
 There's also some important notes to make on performance, which we expect to 
 be reported by our automated systems:
 
 - Bug 1000640 is about WebGL. Currently OMTC regresses WebGL performance 
 considerably, patches to fix this are underway and this should be fixed on 
 the very short term.
 
 - Several of the Talos test suite numbers will change considerably 
 (especially with Direct2D enabled), this means Tscroll for example will 
 improve by ~25%, but tart will regress by ~20%, and several other suites will 
 regress as well. We've investigated this extensively and we believe the 
 majority of these regressions are due to the nature of OMTC and the fact that 
 we have to do more work. We see no value in holding off OMTC because of these 
 regressions as we'll have to go there anyway. Once the last correctness and 
 stability problems are all solved we will go back to trying to find ways to 
 get back some of the performance regressions. We're also planning to move to 
 a system more like tiling in desktop, which will change the performance 
 characteristics significantly again, so we don't want to sink too much time 
 into optimizing the current situation.
 
 - Memory numbers will increase somewhat, this is unavoidable, there's several 
 steps which have to be taken when doing off main thread compositing (like 
 double-buffering), which inherently use more memory.
 
 - On a brighter note: Async video is also enabled by these patches. This 
 means that when the main thread is busy churning JavaScript, instead of 
 stuttering your video should now happily continue playing!
 
 - Also there's some indications that there's a subjective increase in 
 scrolling performance as well.
 
 
 If you have any questions please feel free to reach out to myself or other 
 members of the graphics team!
 
 
 Bas
 


-- 
Zambrano Gasparnian, Armen (armenzg)
Mozilla Senior Release Engineer
https://mozillians.org/en-US/u/armenzg/
http://armenzg.blogspot.ca
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disabling b2g reftests on the minis on trunk

2014-04-10 Thread Armen Zambrano G.
Hello,
Due to bug 907770 we're going to disable b2g reftests on the trunk trees
on the Mac Rev3 minis a week earlier. We will keep running b2g reftests
on the EC2 instances.

The b2g reftests on EC2 have been reliably green for several weeks already.

Next week we will disable the jobs on the minis for mozilla-aurora and
mozilla-b2g26/28.

If you want the full details or add any comments please visit:
https://bugzilla.mozilla.org/show_bug.cgi?id=994936

regards,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Linux testing on single-core VMs nowadays

2014-04-08 Thread Armen Zambrano G.
We do talos testing on in-house machinery (4-core iX machines).
Not sure if that would trigger some of the issues you are hoping to
catch.

In the future, we should be able to have some jobs run on different EC2
instance types. See https://bugzilla.mozilla.org/show_bug.cgi?id=985650
It will require lots of work but it is possible.

cheers,
Armen

On 14-04-08 03:45 AM, ishikawa wrote:
 On (2014年04月08日 15:20), Gabriele Svelto wrote:
 On 07/04/2014 23:13, Dave Hylands wrote:
 Personally, I think that the more ways we can test for threading issues the 
 better.
 It seems to me that we should do some amount of testing on single core and 
 multi-core.

 Then I suppose the question becomes how many cores? 2? 4? 8?

 Maybe we can cycle through some different number of cores so that we get 
 coverage without duplicating everything?

 One configuration that is particularly good at catching threading errors
 (especially narrow races) is constraining the software to run on two
 hardware threads on the same SMT-enabled core. This effectively forces
 the threads to share the L1 D$ which in turn can reveal some otherwise
 very-hard-to-find data synchronization issues.

 I don't know if we have that level of control on our testing hardware
 but if we do then that's a scenario we might want to include.

  Gabriele
 
 I run thunderbird under valgrind from time to time.
 
 Valgrind slows down the CPU execution by a very large factor and
 it seems to open many windows for thread races.
 (Sometimes a very short window is prolonged enough so that events caused by,
 say,
 I/O can fall inside this prolonged usually short window.)
 
 During valgrind execution,I have seen errors that were not reported
 anywhere, and many have
 happened only once :-(
 
 If VM (such as VirtualBox, VMplayer or something) can artificially
 change the execution time of CPU or even different cores slightly (maybe
 1/2, 1/3, 1/4)
 I am sure many thread-race issues will be caught.
 
 I agree that this is a brute-force approach, but please recall that the
 first space shuttle launch needed to be
 aborted due to software glitch. It was a timing issue and according to the
 analysis of the time,
 it could happen once in 72 (or was it 74) cases.
 Even NASA with a large pocket of money and its subcontractor could not catch
 it before launch.
 
 I am afraid that the situation has not changed much (unless we use a
 computer language well suited to
 avoid these thread-race issues.)
 We need all the help to track down visible and dormant thread-races.
 If artificial CPU execution tweaking (by changing the # of cores or even
 more advanced tweaking methods if available) can help, it is worth a try.
 Maybe not always if such a work cost extra money, but
 a prolonged (say a week) testing from time to time (each quarter or half a
 year, or
 maybe just prior to testing of beta of major release?).
 
 
 TIA
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Armen Zambrano G.
On 14-03-26 07:53 PM, Taras Glek wrote:
 *User Repos*
 TLDR: I would like to make user repos read-only by April 30th. We should
 archive them by May 31st.
 
 Time  spent operating user repositories could be spent reducing our 
 end-to-end continuous  integration cycles. These do not seem like
 mission-critical repos, seems like developers would be better off
 hosting these on bitbucket or github. Using a 3rd-party host has obvious
 benefits for collaboration  self-service that our existing system will
 never meet.
 
 We are happy to help move specific hg repos to bitbucket.
 
 Once you have migrated your repository, please comment in
 https://bugzilla.mozilla.org/show_bug.cgi?id=988628so we can free some
 disk space.
 
 *Non-User Repos*
 There  are too many non-user repos. I'm not convinced we should host
 ash, oak, other project branches internally. I think we should focus on 
 mission-critical repos only. There should be less than a dozen of those.
 I would like to stop hosting non-mission-critical repositories by end of
 Q2.

First of all, I applaud this and it's important to get it done. However,
we need to review what is used within the releng system and the security
implications of using non-Mozilla hosting for repos.

Our infra also allows testing talos repositories under hg.m.o/users/blah
on the try server. We should also get security sign-off for a different
type of hosting for those repos.

We're putting an etherpad together listing the repos important to releng systems.

cheers,
Armen


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Armen Zambrano G.
On 14-03-26 08:27 PM, Bobby Holley wrote:
 I don't understand what the overhead is. We don't run CI on user repos.
 It's effectively just ssh:// + disk space, right? That seems totally
 negligible.
 
FTR, from an operations standpoint, it is never "just". Never.
If it was *just* we wouldn't even be having this conversation. Trust me.

regards,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Test harness options now defined in-tree -- please pull m-c before pushing to try!

2014-03-26 Thread Armen Zambrano G.
On 14-03-26 09:11 AM, Andrew Halberstadt wrote:
 On 26/03/14 09:06 AM, Andrew Halberstadt wrote:
 On 26/03/14 12:19 AM, Gregory Szorc wrote:
 Andrew: I'm curious if you or anyone else has given any consideration
 into making it dead simple to change the config so only a subset of
 tests will be active. e.g. I'd like to save automation resources and
 filter my browser chrome jobs to tests under toolkit/mozapps/extensions.
 This is now possible but requires modifying N config files under
 testing/config/mozharness. Could we get that down to 1 or added to Try
 syntax somehow?

 Releng (and I tend to agree with them) is very opposed to having one
 config affect multiple platforms (or at least multiple buildapps). Over
 time it tends to lead to messy and hard to follow configs. It also makes
 it harder in the case that you *do* only want to change a single
 platform.

 Another more practical reason for not doing this is that often different
 platforms have vastly different command lines (e.g b2g vs firefox vs
 fennec are all different). I can't really think of an intuitive way that
 lets us share common command lines while having different sets of
 configs for the uncommon ones. IMO editing N configs isn't that big of a
 deal.. but if you really wanted to on a project branch or something, I
 suppose you could create a global config and have the platform
 specific ones import it. If this scheme was sane and easy to understand,
 I could possibly be convinced to land it on m-c.

 Andrew
 
 To follow up, I wouldn't be opposed to having an empty global config
 that is just there to make pushing to try easier but is not used on
 production branches.

We should be able to give try different behaviour the same way that
talos.json allows us to test different talos repositories or revisions.

This is awesome. Well done!
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running b2g reftests on mozilla-inbound on EC2 instances

2014-03-21 Thread Armen Zambrano G.
We're now running b2g reftests on EC2 on every trunk branch side-by-side
with the Fedora minis:
https://tbpl.mozilla.org/?jobname=b2g_emulator.*reftest

They're running around 30-40% slower. We are looking at enabling them on
faster EC2 instances so we can disable the minis.

regards,
Armen

On 14-03-14 11:14 AM, Armen Zambrano G. wrote:
 Hello,
 We're trying to move the b2g reftests from the old Mac minis to run on
 the EC2 instances.
 
 We're going to run the jobs side-by-side on mozilla-inbound for few days
 to see how it behaves with more load [1].
 
 If you see the jobs misbehaving please hide them and let us know in bug
 818968.
 
 The jobs have been running green on Elm (R5 has been recently fixed).
 
 cheers,
 Armen
 
 [1]
 https://tbpl.mozilla.org/?tree=Mozilla-Inboundjobname=b2g_emulator_vm.*elm%20opt%20test%20reftestshowall=1
 [2] https://bugzilla.mozilla.org/show_bug.cgi?id=818968
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Running b2g reftests on mozilla-inbound on EC2 instances

2014-03-14 Thread Armen Zambrano G.
Hello,
We're trying to move the b2g reftests from the old Mac minis to run on
the EC2 instances.

We're going to run the jobs side-by-side on mozilla-inbound for few days
to see how it behaves with more load [1].

If you see the jobs misbehaving please hide them and let us know in bug
818968.

The jobs have been running green on Elm (R5 has been recently fixed).

cheers,
Armen

[1]
https://tbpl.mozilla.org/?tree=Mozilla-Inboundjobname=b2g_emulator_vm.*elm%20opt%20test%20reftestshowall=1
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=818968
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Disabling Mac + Linux32-bit builds/tests for mozilla-b2g18 and mozilla-b2g18-v1.1.0hd

2013-12-16 Thread Armen Zambrano G.
I've heard no objections.
I will proceed with this plan.

On 13-12-12 01:35 PM, Armen Zambrano G. wrote:
 +dev.b2g
 
 tl;dr
 - we want to disable Mac + Linux32-bit build/tests for mozilla-b2g18 and
 mozilla-b2g18-v1.1.0hd
 
 Hi Ryan,
 I would be fine of taking care of it if I hear no objections by next week.
 
 On 13-12-12 12:12 PM, Ryan VanderMeulen wrote:
 On 12/12/2013 11:58 AM, arme...@mozilla.com wrote:
 Hello all,
 After disabling the Win7 tests on those b2g18 trees, I was told by the
 sheriffs that the MacOS X testing is redundant.

 Again:
 * these are repositories where only security fixes are being landed
 * this is not about b2g26
 * we have testing coverage through Linux, Linux64, Android Noion + B2g
 emulator test jobs

 If you have any concerns/questions/objections please raise them up in:
 https://bugzilla.mozilla.org/show_bug.cgi?id=949522

 cheers,
 Armen

 https://tbpl.mozilla.org/?tree=Mozilla-B2g18
 https://tbpl.mozilla.org/?tree=Mozilla-B2g18-v1.1.0hd


 Are we getting any value from testing both Linux32 and Linux64 on b2g18*
 at this point? Linux32 in particular suffers from a good number of
 intermittents that are near perma-fail. Given the support status of the
 branch, I'm resigned to just riding them out, but killing the builds
 would take of it too. Linux64 is much better from a failure standpoint.
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disable Windows builds and nightly builds on b2g18 and b2g18-v1.0.0-hd

2013-12-10 Thread Armen Zambrano G.
As I was looking into the patches, I noticed that if we remove the
Windows tests there would be no good reason to trigger the win32 opt &
debug builds, as they would trigger nothing.

I will be disabling the win32 builds as well.

Please comment in the bug in case you have any concerns about it:
https://bugzilla.mozilla.org/show_bug.cgi?id=948135

cheers,
Armen

On 13-12-09 01:52 PM, Armen Zambrano G. wrote:
 Changing the subject to make it more clear.
 I will be raising this on the Monday weekly call as well as the
 Engineering meeting.
 
 On 13-12-07 03:31 PM, Armen Zambrano G. wrote:
 (Please follow up on mozilla.dev.b2g)

 Adding dev.platform to reach a wider audience.

 I will bring this up at the Engineering meeting on Tuesday in case I
 don't hear anything back by Tuesday.

 cheers,
 Armen

 On 13-12-06 02:50 PM, Armen Zambrano G. wrote:
 Hi all,
 Given that we are only doing security fixes on those two branches.
 Anyone can say if we still need desktop unit tests and nightly builds on
 those two branches? [1][2]
 Would running B2G builds/tests and linux32 build/tests be good enough?

 Those jobs are running on the old Rev3 minis and would be great to get
 rid of them now rather than wait until March.

 regards,
 Armen


 [1] https://tbpl.mozilla.org/?tree=Mozilla-B2g18
 [2] https://tbpl.mozilla.org/?tree=Mozilla-B2g18-v1.1.0hd


 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disabling Windows 7 testing and nightly builds on b2g18 and b2g18-v1.0.0-hd

2013-12-09 Thread Armen Zambrano G.
Changing the subject to make it more clear.
I will be raising this on the Monday weekly call as well as the
Engineering meeting.

On 13-12-07 03:31 PM, Armen Zambrano G. wrote:
 (Please follow up on mozilla.dev.b2g)
 
 Adding dev.platform to reach a wider audience.
 
 I will bring this up at the Engineering meeting on Tuesday in case I
 don't hear anything back by Tuesday.
 
 cheers,
 Armen
 
 On 13-12-06 02:50 PM, Armen Zambrano G. wrote:
 Hi all,
 Given that we are only doing security fixes on those two branches.
 Anyone can say if we still need desktop unit tests and nightly builds on
 those two branches? [1][2]
 Would running B2G builds/tests and linux32 build/tests be good enough?

 Those jobs are running on the old Rev3 minis and would be great to get
 rid of them now rather than wait until March.

 regards,
 Armen


 [1] https://tbpl.mozilla.org/?tree=Mozilla-B2g18
 [2] https://tbpl.mozilla.org/?tree=Mozilla-B2g18-v1.1.0hd

 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Desktop unit tests and nightly builds on b2g18 and b2g18-v1.0.0-hd

2013-12-07 Thread Armen Zambrano G.

(Please follow up on mozilla.dev.b2g)

Adding dev.platform to reach a wider audience.

I will bring this up at the Engineering meeting on Tuesday in case I 
don't hear anything back by Tuesday.


cheers,
Armen

On 13-12-06 02:50 PM, Armen Zambrano G. wrote:

Hi all,
Given that we are only doing security fixes on those two branches.
Anyone can say if we still need desktop unit tests and nightly builds on
those two branches? [1][2]
Would running B2G builds/tests and linux32 build/tests be good enough?

Those jobs are running on the old Rev3 minis and would be great to get
rid of them now rather than wait until March.

regards,
Armen


[1] https://tbpl.mozilla.org/?tree=Mozilla-B2g18
[2] https://tbpl.mozilla.org/?tree=Mozilla-B2g18-v1.1.0hd



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Killing ESR17 builds & tests on tbpl

2013-12-02 Thread Armen Zambrano G.
Hello all,
Next week, we will have our next merge date [1] on Dec. 9th, 2013.
As part of that merge day we will be killing the ESR17 builds and tests
[2][3] for tbpl.mozilla.org.

This is part of our normal process where, two merge days after the
creation of the latest ESR release (e.g. ESR24 [4]), we obsolete the
previous one (e.g. ESR17 [3]).

On a note unrelated to this post: we will keep creating updates from
ESR17 to ESR24 even after that date; however, there will be no more
builds and tests on check-in.

Please let me know if you have any questions.

regards,
Armen
#
Zambrano Gasparnian, Armen (armenzg)
Mozilla Senior Release Engineer
https://mozillians.org/en-US/u/armenzg/
http://armenzg.blogspot.ca


[1] https://wiki.mozilla.org/RapidRelease/Calendar
[2] https://wiki.mozilla.org/Enterprise/Firefox/ExtendedSupport:Proposal
[3] https://tbpl.mozilla.org/?tree=Mozilla-Esr17
[4] https://tbpl.mozilla.org/?tree=Mozilla-Esr24
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disabling today the 10.7 test infrastructure across the board

2013-11-26 Thread Armen Zambrano G.
This is part 1 of the thread Proposed changes to RelEng's OSX build and
test infrastructure.

I wanted to make it very clear so no one is surprised when it happens.

cheers,
Armen


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


[completed] Re: Disabling today the 10.7 test infrastructure across the board

2013-11-26 Thread Armen Zambrano G.
This has been completed.
All 10.7 jobs have been disabled.
Those 10.7 machines will be re-purposed as 10.6 machines.
We should have them all re-imaged and operational by next week.

If you find any issues please use
https://bugzilla.mozilla.org/show_bug.cgi?id=942299

cheers,
Armen

On 11/26/2013, 10:11 AM, Armen Zambrano G. wrote:
 This is part 1 of the thread Proposed changes to RelEng's OSX build and
 test infrastructure.
 
 I wanted to make it very clear so no one is surprised when it happens.
 
 cheers,
 Armen
 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposed changes to RelEng's OSX build and test infrastructure

2013-11-25 Thread Armen Zambrano G.
On 11/22/2013, 3:44 PM, Johnathan Nightingale wrote:
 On Nov 22, 2013, at 12:29 PM, Ted Mielczarek wrote:
 
 On 11/21/2013 4:56 PM, John O'Duinn wrote:
 6) If a developer lands a patch that works on 10.9, but it fails somehow
 on 10.7 or 10.8, it is unlikely that we would back out the fix, and we
 would instead tell users to upgrade to 10.9 anyways, for the security
 fixes.
 
 This seems to go against our historical policy. While it's true that we
 might not back a patch out for 10.7/10.8 failures (since we won't have
 automated test coverage), if they're still supported platforms then we
 would still look to fix the bug. That might require backing a patch out
 or landing a new fix. I don't think we need to over-rotate on this, this
 is no different than any of the myriad of regressions or bugs we have
 reported by users with software configurations different than what we're
 able to run tests on.

 I would instead simply say 10.7 and 10.8 will remain supported OSes,
 and bugs affecting only those platforms will be considered and
 prioritized as necessary. It sounds a little weasely when I write it
 that way, but I don't think we should WONTFIX bugs just because they're
 on a supported platform without test coverage, we'd simply treat them as
 we would any other bug a user reports: something we ought to fix,
 prioritized as is seen fit by developers.
 
 
 I agree - we have not decided to mark 10.7 or 10.8 as tier 2 or otherwise 
 less supported. I don't mind assuming that 10.6/10.9 tests oughta catch most 
 of the problems, but if they miss one and we break 10.7/10.8, I'd expect us 
 to find a solution for that, or back out if the bustage is significant and 
 not easily fixable.
 
 J
 
 ---
 Johnathan Nightingale
 VP Firefox
 @johnath
 

Thanks, Johnathan. Yes, that approach makes sense to me.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposed changes to RelEng's OSX build and test infrastructure

2013-11-22 Thread Armen Zambrano Gasparnian


- Original Message -
 From: Mike Hommey m...@glandium.org
 To: John O'Duinn jodu...@mozilla.com
 Cc: dev. planning dev-plann...@lists.mozilla.org, 
 dev-platform@lists.mozilla.org, release rele...@mozilla.com
 Sent: Friday, November 22, 2013 1:34:46 AM
 Subject: Re: Proposed changes to RelEng's OSX build and test infrastructure
 
 On Thu, Nov 21, 2013 at 04:56:50PM -0500, John O'Duinn wrote:
  6) If a developer lands a patch that works on 10.9, but it fails somehow
  on 10.7 or 10.8, it is unlikely that we would back out the fix, and we
  would instead tell users to upgrade to 10.9 anyways, for the security
  fixes.
 
 It's not because we tell users to do so that they will. Chances are they
 will choose another browser that works on their OS than upgrade it. I do
 know a few people that don't like the new things in 10.8 and are keeping
 10.7 on purpose.
 
 How much effort would it be to image the test machines such that they
 can boot multiple versions of OSX, and have them reboot under 10.7 and
 10.8 instead of 10.6 and 10.9 when demand is low, and run tests on those
 platforms?
 
We have thought of this in the past; however, we have always had
higher-priority goals.
We would have to consult with RelOps (IT's supporting branch for Releng) and it 
would require some re-work of the imaging systems.
In fact, we're currently working on switching from our current system, 
DeployStudio, to Casper.

On the scheduling side it would require a decent amount of re-factoring. This 
would have to be added to our planned re-write of our scheduling system.

We would also still have to deal with the lack of immediate feedback for commits 
on 10.7 and 10.8, and with how to handle backouts and regression hunting.

 Mike
 
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: how long are we continuing 32-bit OS X support?

2013-10-23 Thread Armen Zambrano G.
Releng note: if we stop it, we could also get much better test capacity
on tbpl as we could re-purpose our 10.6 test infrastructure.

cheers,
Armen

On 2013-10-21 1:14 PM, Ehsan Akhgari wrote:
 Note that we also use this to support 32-bit plugins, so our target
 audience is not just 10.6 users.
 
 Cheers,
 Ehsan
 
 On 2013-10-21 11:24 AM, Nathan Froyd wrote:
 [Not sure if this is strictly dev-platform material; dev-planning
 might have been more appropriate in some respects.]

 Our Firefox builds for OS X currently build a 32-bit version, a 64-bit
 version, and then squash those together to produce a universal binary
 that runs on 32-bit or 64-bit systems, as appropriate.  This method of
 building is somewhat wasteful, as we're building a decent chunk of
 stuff (chrome JS, etc.) that is architecture independent.

 When targeting OS X, native or cross, clang supports compiling for
 multiple architectures in a single pass.  Being able to use this
 capability would likely speed up our TBPL OS X builds somewhat. 
 (Slowdown in actual compilation would likely be compensated for by not
 having to build chrome jars twice, install headers twice, etc. etc.) 
 However, a variety of issues block being able to do this, mostly
 related to a number of places the code where decisions are made based
 on whether configure thought we're compiling for a 32-bit or 64-bit
 architecture.

 I've started poking at some of these issues (bug 925167 and dependents
 if you're interested).  But various people have pointed out that it
 only makes sense to devote resources to this if we're going to be
 keeping 32-bit OS X support alive for some time to come.  Otherwise,
 we can just wait a short amount of time for 32-bit support to be
 dropped and these issues go away.

 How long do we intend to continue shipping a 32-bit Firefox binary on
 OS X?  As I understand it, we're doing this solely for our OS X 10.6
 users, as they are the only ones potentially running OS X on
 non-64-bit capable machines.

 -Nathan
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform

 

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Wanted: Feedback on Windows 64-bit rev2 cutover options

2013-10-17 Thread Armen Zambrano Gasparnian
Option #3 sounds logical.

How would sheriffs/devs determine which machines are rev2
and which ones are rev1?
Should they use the info in production_config.py?
- rev2 machines:  [1]
- rev2 (try): currently empty [2]

[1]
http://hg.mozilla.org/build/buildbot-configs/file/production/mozilla/production_config.py#l8
[2]
http://hg.mozilla.org/build/buildbot-configs/file/production/mozilla/production_config.py#l33

On 2013-10-17 2:34 PM, John Hopkins wrote:
 Here are three different proposals to cut over to the new Windows 64-bit
 rev2 (win64-rev2) machines (see
 https://groups.google.com/forum/#!topic/mozilla.dev.platform/zACrUe_JwKw
 for context), along with some of the pros and cons of each approach.
 I would prefer option #3 (gradual phase-in) so long as we're okay with
 having a mixed pool of rev1/rev2 win64 build machines on the same
 branches.  Timing will depend on which option we go with.

 Please let me know as soon as possible your preference or whether you
 have any other comments/concerns.  I will assume option #3 if there are
 no objections by end of day Friday.

 In addition, I'd like to have a smoke test performed against the Windows
 builds on one of our project branches (which have already been cut over)
 -  fx-team or ux seem like good candidates.  I will follow up with the
 QA team.


 Win64-rev2 Cutover Proposals:

 1) Cut over all of inbound/try/central to win64-rev2 over a weekend.
 Pros:
 - all branches built using the same machine image
 - lowest volume of checkins happen on weekends, so less wait time impact
 Cons:
 - big bang upgrade.  If someone discovers an issue on Monday, a
 lengthy downtime could be required to resolve it.
 - build wait times on the weekend could be lengthy or require a tree closure
 - staffing weekend work is problematic


 2) Cut over inbound/try/central branches one at a time, each early on a
 Monday (over several weeks)
 Pros:
 - more people around to find/fix problems
 Cons:
 - longer wait times
 - not all branches using the same build machine image
 - longer cutover time


 3) Gradual phase-in of win64-rev2 machines on inbound/try/central in a
 mixed rev1/rev2 pool
 Pros:
 - limited impact of bustages (due to mixing in a small number of rev2
 build machines to start and gradually increasing)
 - no impact on wait times (could even improve them slightly since we're
 a bit low on rev1 capacity at the moment)
 - no weekend work required, can be done during the week as time permits
 Cons:
 - mix of rev1/rev2 build machines, mitigated by having exclusively rev2
 allocated to project branches for testing rev2-specific bustage fixes


 Note: This will not impact release branches (aurora, beta, release,
 esr).  We will cut over to those branches only once win64 rev2 has been
 proven on inbound/try/central.  Bug 781277 is the overall tracking bug.

 Thanks,
 John



-- 
Zambrano Gasparnian, Armen (armenzg)
Mozilla's Release Engineer
https://mozillians.org/en-US/u/armenzg/
http://armenzg.blogspot.ca

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Switching talos jobs to not grab files from http://build.mozilla.org/talos

2013-08-15 Thread Armen Zambrano G.
Hi,
After Sep. 30th we will not be grabbing files anymore from
http://build.mozilla.org/talos but from
http://talos-bundles.pvt.build.mozilla.org.

As of today, all changes have landed and made live for all of our
development trees (including esr and b2g18 trees).

Any _talos_ jobs that are pushed to the try server with older changesets
[1] or any _talos_ jobs re-triggered on older changesets will fail.

To avoid this on try pushes, make sure your push includes the changes
to the following two files:
- testing/talos/talos.json
- testing/talos/talos_from_code.py

For re-triggers, you will have to push to the try server with the
mentioned changes.

best regards,
Armen

[1]
https://hg.mozilla.org/mozilla-central/rev/d5a9b3ef1706
https://hg.mozilla.org/releases/mozilla-aurora/rev/932868752f82
https://hg.mozilla.org/releases/mozilla-beta/rev/32c8013dbee9
https://hg.mozilla.org/releases/mozilla-release/rev/b5e2368423eb
https://hg.mozilla.org/releases/mozilla-esr17/rev/8ac7eb904308
https://hg.mozilla.org/releases/mozilla-b2g18/rev/5bd5ac591922
https://hg.mozilla.org/releases/mozilla-b2g18_v1_1_0_hd/rev/30741bd27846
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


talos mozharness status

2013-08-01 Thread Armen Zambrano G.

Hi,
* most of the ts performance hit that we were expecting should have 
finished being reported by now

* we had missed enabling talos mozharness for the PGO builders
** this means that the PGO branches will start reporting today and might 
take a day or two to clear up (like the other branches)
* we want to enable it also for FF24 in preparation for ESR24 and not 
have to maintain two different code paths and kill one when ESR17 dies
* there are some minor bugs that were discovered under heavy load; they 
have been filed and we will tackle them as soon as possible


regards,
Armen

On 2013-07-30 10:17 AM, Armen Zambrano G. wrote:

Talos mozharness is now live on all FF25 trees.
Remember that we will see some ts regressions.

More info in:
http://armenzg.blogspot.ca/2013/07/enabling-talos-mozharness-for-ff25.html

This will ride the trains.

cheers,
Jason  Armen

On 2013-07-29 1:04 PM, Armen Zambrano G. wrote:

This will be going live tomorrow Tuesday 30th.

On 2013-07-23 4:38 PM, Armen Zambrano G. wrote:

I need these new changesets to spread across the FF25 trees before going
ahead with this:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d4ab37e3f3e
https://hg.mozilla.org/integration/mozilla-inbound/rev/496a7582cf9e

I'm postponing this until Monday.
Sorry for the noise. I want to make sure that it all goes as expected.

cheers,
Jason & Armen

On 2013-07-22 2:44 PM, Armen Zambrano G. wrote:

Last week we enabled mozharness for talos on the try server and we have
resolved all found issues since then. The issues were related to proper
integration with tbpl and talos's try support.

We will switch talos jobs to be driven by mozharness rather than through
buildbot by Wednesday morning (EDT).

I assume that changeset 3d1c2ca7efe8 is already on your local checkout
after a week being in the tree but worth raising it up again.
  There's one thing to do on your part if you want to not have failing
  *talos* jobs on the try server, make sure that the changeset
  3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated
  your repo from m-i by Friday 12th at 10:19AM PDT you should be good
  to go.

regards,
Jason & Armen

[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8

On 2013-07-16 8:51 AM, Armen Zambrano G. wrote:

Hi,
We have recently been working hard to separate the buildbot logic that
runs our talos jobs on tbpl to its own separate script (using
mozharness). [1][2]

This has the advantage of permitting anyone (especially the a-team) to
adjust how our harnesses run talos inside of our infrastructure
without
having to set up buildbot (which is what currently runs our talos
jobs).
This also permits anyone to run the jobs locally in the same manner as
Releng's infrastructure. This also allows for further development and
flexibility on how we configure the jobs we run.

Initially, we will enable it on the try server today to see
production-like load. So far, it's been looking great on Cedar. [3]

The only gotcha is that there will be a small performance hit for
the ts
tests that we are willing to take. [4]

There's one thing to do on your part if you want to not have failing
*talos* jobs on the try server, make sure that the changeset
3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated
your
repo from m-i by Friday 12th at 10:19AM PDT you should be good to go.

Once we get a couple of days worth of load on the try server and see
nothing new we will go ahead and enable it for every m-c based
repository.

If you have any questions/concerns please write a comment on bug
713055.

Best regards,
Jason & Armen
Release Engineering

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=713055
[2] https://developer.mozilla.org/en-US/docs/Mozharness_FAQ
[3] https://tbpl.mozilla.org/?tree=Cedar&jobname=talos
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=802801#c10
[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8










___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


[LIVE] Enabling mozharness for talos for FF25 projects

2013-07-30 Thread Armen Zambrano G.

Talos mozharness is now live on all FF25 trees.
Remember that we will see some ts regressions.

More info in: 
http://armenzg.blogspot.ca/2013/07/enabling-talos-mozharness-for-ff25.html


This will ride the trains.

cheers,
Jason & Armen

On 2013-07-29 1:04 PM, Armen Zambrano G. wrote:

This will be going live tomorrow Tuesday 30th.

On 2013-07-23 4:38 PM, Armen Zambrano G. wrote:

I need these new changesets to spread across the FF25 trees before going
ahead with this:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d4ab37e3f3e
https://hg.mozilla.org/integration/mozilla-inbound/rev/496a7582cf9e

I'm postponing this until Monday.
Sorry for the noise. I want to make sure that it all goes as expected.

cheers,
Jason & Armen

On 2013-07-22 2:44 PM, Armen Zambrano G. wrote:

Last week we enabled mozharness for talos on the try server and we have
resolved all found issues since then. The issues were related to proper
integration with tbpl and talos's try support.

We will switch talos jobs to be driven by mozharness rather than through
buildbot by Wednesday morning (EDT).

I assume that changeset 3d1c2ca7efe8 is already on your local checkout
after a week being in the tree but worth raising it up again.
  There's one thing to do on your part if you want to not have failing
  *talos* jobs on the try server, make sure that the changeset
  3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated
  your repo from m-i by Friday 12th at 10:19AM PDT you should be good
  to go.

regards,
Jason & Armen

[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8

On 2013-07-16 8:51 AM, Armen Zambrano G. wrote:

Hi,
We have recently been working hard to separate the buildbot logic that
runs our talos jobs on tbpl to its own separate script (using
mozharness). [1][2]

This has the advantage of permitting anyone (especially the a-team) to
adjust how our harnesses run talos inside of our infrastructure without
having to set up buildbot (which is what currently runs our talos
jobs).
This also permits anyone to run the jobs locally in the same manner as
Releng's infrastructure. This also allows for further development and
flexibility on how we configure the jobs we run.

Initially, we will enable it on the try server today to see
production-like load. So far, it's been looking great on Cedar. [3]

The only gotcha is that there will be a small performance hit for
the ts
tests that we are willing to take. [4]

There's one thing to do on your part if you want to not have failing
*talos* jobs on the try server, make sure that the changeset
3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated your
repo from m-i by Friday 12th at 10:19AM PDT you should be good to go.

Once we get a couple of days worth of load on the try server and see
nothing new we will go ahead and enable it for every m-c based
repository.

If you have any questions/concerns please write a comment on bug
713055.

Best regards,
Jason & Armen
Release Engineering

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=713055
[2] https://developer.mozilla.org/en-US/docs/Mozharness_FAQ
[3] https://tbpl.mozilla.org/?tree=Cedar&jobname=talos
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=802801#c10
[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8








___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: [RE-SCHEDULED] Re: Enabling mozharness for talos for FF25 projects

2013-07-29 Thread Armen Zambrano G.

This will be going live tomorrow Tuesday 30th.

On 2013-07-23 4:38 PM, Armen Zambrano G. wrote:

I need these new changesets to spread across the FF25 trees before going
ahead with this:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d4ab37e3f3e
https://hg.mozilla.org/integration/mozilla-inbound/rev/496a7582cf9e

I'm postponing this until Monday.
Sorry for the noise. I want to make sure that it all goes as expected.

cheers,
Jason & Armen

On 2013-07-22 2:44 PM, Armen Zambrano G. wrote:

Last week we enabled mozharness for talos on the try server and we have
resolved all found issues since then. The issues were related to proper
integration with tbpl and talos's try support.

We will switch talos jobs to be driven by mozharness rather than through
buildbot by Wednesday morning (EDT).

I assume that changeset 3d1c2ca7efe8 is already on your local checkout
after a week being in the tree but worth raising it up again.
  There's one thing to do on your part if you want to not have failing
  *talos* jobs on the try server, make sure that the changeset
  3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated
  your repo from m-i by Friday 12th at 10:19AM PDT you should be good
  to go.

regards,
Jason & Armen

[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8

On 2013-07-16 8:51 AM, Armen Zambrano G. wrote:

Hi,
We have recently been working hard to separate the buildbot logic that
runs our talos jobs on tbpl to its own separate script (using
mozharness). [1][2]

This has the advantage of permitting anyone (especially the a-team) to
adjust how our harnesses run talos inside of our infrastructure without
having to set up buildbot (which is what currently runs our talos jobs).
This also permits anyone to run the jobs locally in the same manner as
Releng's infrastructure. This also allows for further development and
flexibility on how we configure the jobs we run.

Initially, we will enable it on the try server today to see
production-like load. So far, it's been looking great on Cedar. [3]

The only gotcha is that there will be a small performance hit for the ts
tests that we are willing to take. [4]

There's one thing to do on your part if you want to not have failing
*talos* jobs on the try server, make sure that the changeset
3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated your
repo from m-i by Friday 12th at 10:19AM PDT you should be good to go.

Once we get a couple of days worth of load on the try server and see
nothing new we will go ahead and enable it for every m-c based
repository.

If you have any questions/concerns please write a comment on bug 713055.

Best regards,
Jason & Armen
Release Engineering

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=713055
[2] https://developer.mozilla.org/en-US/docs/Mozharness_FAQ
[3] https://tbpl.mozilla.org/?tree=Cedar&jobname=talos
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=802801#c10
[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8






___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Enabling mozharness for talos for FF25 projects (was Re: Running talos through mozharness)

2013-07-22 Thread Armen Zambrano G.
Last week we enabled mozharness for talos on the try server and we have 
resolved all found issues since then. The issues were related to proper 
integration with tbpl and talos's try support.


We will switch talos jobs to be driven by mozharness rather than through 
buildbot by Wednesday morning (EDT).


I assume that changeset 3d1c2ca7efe8 is already on your local checkout 
after a week being in the tree but worth raising it up again.

 There's one thing to do on your part if you want to not have failing
 *talos* jobs on the try server, make sure that the changeset
 3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated
 your repo from m-i by Friday 12th at 10:19AM PDT you should be good
 to go.

regards,
Jason & Armen

[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8

On 2013-07-16 8:51 AM, Armen Zambrano G. wrote:

Hi,
We have recently been working hard to separate the buildbot logic that
runs our talos jobs on tbpl to its own separate script (using
mozharness). [1][2]

This has the advantage of permitting anyone (especially the a-team) to
adjust how our harnesses run talos inside of our infrastructure without
having to set up buildbot (which is what currently runs our talos jobs).
This also permits anyone to run the jobs locally in the same manner as
Releng's infrastructure. This also allows for further development and
flexibility on how we configure the jobs we run.

Initially, we will enable it on the try server today to see
production-like load. So far, it's been looking great on Cedar. [3]

The only gotcha is that there will be a small performance hit for the ts
tests that we are willing to take. [4]

There's one thing to do on your part if you want to not have failing
*talos* jobs on the try server, make sure that the changeset
3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated your
repo from m-i by Friday 12th at 10:19AM PDT you should be good to go.

Once we get a couple of days worth of load on the try server and see
nothing new we will go ahead and enable it for every m-c based repository.

If you have any questions/concerns please write a comment on bug 713055.

Best regards,
Jason & Armen
Release Engineering

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=713055
[2] https://developer.mozilla.org/en-US/docs/Mozharness_FAQ
[3] https://tbpl.mozilla.org/?tree=Cedar&jobname=talos
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=802801#c10
[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Running talos through mozharness

2013-07-16 Thread Armen Zambrano G.

Hi,
We have recently been working hard to separate the buildbot logic that 
runs our talos jobs on tbpl to its own separate script (using 
mozharness). [1][2]


This has the advantage of permitting anyone (especially the a-team) to 
adjust how our harnesses run talos inside of our infrastructure without 
having to set up buildbot (which is what currently runs our talos jobs). 
This also permits anyone to run the jobs locally in the same manner as 
Releng's infrastructure. This also allows for further development and 
flexibility on how we configure the jobs we run.


Initially, we will enable it on the try server today to see 
production-like load. So far, it's been looking great on Cedar. [3]


The only gotcha is that there will be a small performance hit for the ts 
tests that we are willing to take. [4]


There's one thing to do on your part if you want to not have failing 
*talos* jobs on the try server, make sure that the changeset 
3d1c2ca7efe8 is in your local checkout [5][6]. If you have updated your 
repo from m-i by Friday 12th at 10:19AM PDT you should be good to go.


Once we get a couple of days worth of load on the try server and see 
nothing new we will go ahead and enable it for every m-c based repository.


If you have any questions/concerns please write a comment on bug 713055.

Best regards,
Jason & Armen
Release Engineering

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=713055
[2] https://developer.mozilla.org/en-US/docs/Mozharness_FAQ
[3] https://tbpl.mozilla.org/?tree=Cedar&jobname=talos
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=802801#c10
[5] http://hg.mozilla.org/integration/mozilla-inbound/rev/3d1c2ca7efe8
[6] http://hg.mozilla.org/mozilla-central/rev/3d1c2ca7efe8
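
As a rough sketch of what "run the jobs locally in the same manner as
RelEng's infrastructure" can look like, the snippet below drives a mozharness
script from Python; the script path, config file and suite name are
placeholders only (see the Mozharness FAQ above for the real invocation):

# Illustration only: invoke a mozharness script the way automation does.
# The script path, --cfg value and --suite value below are hypothetical.
import subprocess
import sys

cmd = [
    sys.executable,
    "scripts/talos_script.py",         # placeholder path in a mozharness checkout
    "--cfg", "talos/linux_config.py",  # placeholder config file
    "--suite", "tsvgr",                # placeholder talos suite name
]
returncode = subprocess.call(cmd)
print("mozharness exited with status %d" % returncode)
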
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


New Windows 7 and XP try syntax

2013-07-10 Thread Armen Zambrano G.

FYI

Before:
try: -b do -p win32 -u all[5.1,6.1] -t none
Now:
try: -b do -p win32 -u all[Windows XP,Windows 7] -t none

This triggers jobs on the iX hardware rather than the Rev3 minis.
FYI, we disabled jobs on the Rev3 minis today.
The Try Syntax Chooser page has also been updated.

cheers,
Armen

https://bugzilla.mozilla.org/show_bug.cgi?id=877465
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Hide Windows 7 and Windows XP jobs for Rev3 minis

2013-07-10 Thread Armen Zambrano G.

I did not do this last night.
I will be doing this now, taking into account mbrubeck's request.

On 2013-05-22 11:38 AM, Matt Brubeck wrote:

On 5/22/2013 10:22 AM, Armen Zambrano G. wrote:

We would like to determine if we are ready to stop running jobs on the
rev3 minis before stopping them.

If you have no objections I will hide the builders by EOD.
We can re-visit on Monday if we are ready to go ahead and stop running
jobs on rev3 minis.


My one concern is bug 859571, which prevents us from getting useful
ts_paint data on the iX talos slaves:
https://bugzilla.mozilla.org/show_bug.cgi?id=859571

If possible, we should keep the "talos other" and "talos dirtypaint"
jobs running and visible on the old hardware until that is fixed.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Disable this Thursday Windows 7 and Windows XP on Rev3 machines

2013-07-10 Thread Armen Zambrano G.

This has been live since this morning.
Thank you all that helped make this happen.

http://armenzg.blogspot.com/2013/05/kiss-old-testing-infra-revision-3-mac.html

On 2013-05-28 1:08 PM, Armen Zambrano G. wrote:

(posting to dev.platform and dev.tree-management)

Hello all,
We have been running unit tests and talos jobs on the iX hardware for a
while side by side with the Rev3 minis.

We have disabled the Rev3 mini jobs for a few days on our main tbpl
pages and we have not missed them, AFAIK.

We have some known intermittent oranges but nothing that has been shown
as a show-stopper.

Unless I hear any strong objections, we will stop running jobs on the
rev3 minis for FF23 (m-a) & FF24 (m-c) based projects.

best regards,
Armen
Mozilla's Release Engineering


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Pink pixel of death (reftest failures due to bad RAM)

2013-07-10 Thread Armen Zambrano G.

Hi all,
We have found that sometimes we fail our reftests due to a couple 
of pixels getting stored in bad sectors of the RAM on our machines.

We have also seen garbage collection crashes due to it.

We have also recently discovered that memtest does a better job at 
catching bad RAM than Apple's and other diagnostic tools.


The new problem we have is that it also started happening on Windows iX 
machines (one machine so far).


Do you have any ideas on how to make our tests handle this problem in 
a better way?

Do you have any suggestions of a better tool for memory issues on Windows?
Do you have any ideas on how to quickly check if a memory replacement 
has fixed the issue?


best regards,
Armen

[1] 
https://tbpl.mozilla.org/php/getParsedLog.php?id=23485895&tree=Mozilla-Central#error1

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=857705
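
As a purely illustrative aside on the "quickly check if a memory replacement
has fixed the issue" question: nothing in user space replaces a full memtest
pass, but a crude smoke test along these lines can at least exercise a chunk
of RAM from Python (the chunk size and patterns below are arbitrary):

# Crude user-space RAM smoke test -- illustration only, not a memtest
# replacement. It only touches whichever pages the OS hands this process,
# so a clean run proves little; a failing run is a strong hint.
CHUNK_MB = 128
size = CHUNK_MB * 1024 * 1024
for pattern_byte in (b"\xa5", b"\x5a"):
    pattern = pattern_byte * size   # fill with an alternating-bit pattern
    buf = bytearray(pattern)        # force a fresh, writable copy in RAM
    if bytes(buf) != pattern:
        print("FAILED: pattern 0x%02x did not read back intact" % ord(pattern_byte))
        break
else:
    print("OK: %d MB held both patterns" % CHUNK_MB)
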
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disable this Thursday Windows 7 and Windows XP on Rev3 machines

2013-07-10 Thread Armen Zambrano G.

(posting to dev.platform and dev.tree-management)

Hello all,
We have been running unit tests and talos jobs on the iX hardware for a 
while side by side with the Rev3 minis.


We have disabled the Rev3 mini jobs for a few days on our main tbpl 
pages and we have not missed them, AFAIK.


We have some known intermittent oranges but nothing that has been shown 
as a show-stopper.


Unless I hear any strong objections, we will stop running jobs on the 
rev3 minis for FF23 (m-a) & FF24 (m-c) based projects.


best regards,
Armen
Mozilla's Release Engineering
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Disabled Gaia UI tests on pandas - any reason for B2G *panda* builds?

2013-07-10 Thread Armen Zambrano G.

Hi,
Today, we disabled Gaia UI tests on the B2G pandas [1][2][3]. The reason 
is that they were broken and our testing plan is to test on Desktop 
builds rather than pandas [4].


At some point we used to run these jobs across all trees, then we hid 
them, then we only left them running on Cedar and Gaia-Master.


If we don't have a need to test on panda boards, are there any reasons 
left to keep on creating the B2G panda builds?


Looking forward to hearing back from you.

cheers,
Armen
https://mozillians.org/en-US/u/armenzg

--

[1] 
https://tbpl.mozilla.org/?tree=Cedar&showall=1&jobname=b2g_panda%20cedar%20opt%20test%20gaia-ui-test

[2] https://tbpl-dev.allizom.org/?tree=Gaia-Master
[3] http://pandaboard.org/node/300/#PandaES
[4]
A) It is easier to test on B2G Desktop builds than on Panda boards
B) We don't gain anything from testing on Panda boards (no radio 
capabilities)

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running Windows 7 & XP jobs on ix hardware

2013-05-22 Thread Armen Zambrano G.

On 2013-05-17 1:07 PM, Armen Zambrano G. wrote:

* Windows XP on iX is running *hidden* on m-i, try and cedar


I didn't see any perma-oranges so I made them visible.

I will be adding the remaining branches tomorrow.

cheers,
Armen

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Running Windows 7 & XP jobs on ix hardware

2013-05-17 Thread Armen Zambrano G.

Hi all,
As of today, we run Windows 7 and Windows XP jobs in parallel on iX hardware 
as well as on the minis.


Current status:
* Windows 7 on iX is running *visible* on FF23 based trees
* Windows XP on iX is running *hidden* on m-i, try and cedar

Sometime after Tuesday, I will:
* Evaluate and propose a date for Win7 on iX to take over rev3 minis
** I have to fix a couple of issues
* Make visible WinXP on iX (if it looks good)
** We will also enable it on all appropriate trees (if it looks good)

We're also increasing the Windows iX pools (xp/w7/w8) by 30% (130 nodes 
each).


If you find any blockers, please add them to one of the bugs below.

best regards,
Armen

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Windows%207
https://bugzilla.mozilla.org/show_bug.cgi?id=770578

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Windows%20XP
https://bugzilla.mozilla.org/show_bug.cgi?id=770579
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Windows 7 test jobs running on iX

2013-05-15 Thread Armen Zambrano G.

These jobs are now running across the board up to mozilla-aurora.

It seems that dromaeo css on pgo is misbehaving:
https://bugzilla.mozilla.org/show_bug.cgi?id=870453

cheers,
Armen

On 2013-05-13 11:23 AM, Armen Zambrano G. wrote:

The intermittent failures on consecutive changesets were not as bad as I
first thought.

This week we should have the jobs running across all branches.
IT should have the machines ready between Tuesday and Wednesday.

cheers,
Armen

On 2013-05-09 3:20 PM, Armen Zambrano G. wrote:

Hi all,
We have started running Windows 7 test jobs in our new iX hardware on
the Cedar tree. These machines will replace our current Rev3 minis.

The reason I'm reaching out to you is that I have found some
intermittent failures on those machines and I need some help investigating
them. I'm afraid that they may happen often enough to make
triaging the tree an inconvenience for sheriffs and developers.

* [1] Bug 820738 - Intermittent browser_newtab_block.js | grid status =
0,1,2,3,6,7,8,9, - Got
0,1,2,3,6,7,8,9,http://mochi.test:/browser/browser/base/content/test/test_wyciwyg_copying.html,

expected 0,1,2,3,6,7,8,9, (and another)
* [2] Bug 870485 - Intermittent reftest failure on Windows 7 on iX
machines
* [3] Bug 870490 - Intermittent mochitest-browser-chrome failure on
Windows 7 on iX machines
* [4] Bug 870488 - Intermittent xpcshell failure on Windows 7 on iX
hardware

I hope to have more machines next week and I will enable the jobs on
Mozilla-Inbound and other branches as well.

Thanks in advance,
Armen

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=820738
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=870485
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=870490
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=870488




___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Windows 7 test jobs running on iX

2013-05-13 Thread Armen Zambrano G.
The intermittent failures on consecutive changesets were not as bad as I 
first thought.


This week we should have the jobs running across all branches.
IT should have the machines ready between Tuesday and Wednesday.

cheers,
Armen

On 2013-05-09 3:20 PM, Armen Zambrano G. wrote:

Hi all,
We have started running Windows 7 test jobs in our new iX hardware on
the Cedar tree. These machines will replace our current Rev3 minis.

The reason I'm reaching out to you is that I have found some
intermittent failures on those machines and I need some help investigating
them. I'm afraid that they may happen often enough to make
triaging the tree an inconvenience for sheriffs and developers.

* [1] Bug 820738 - Intermittent browser_newtab_block.js | grid status =
0,1,2,3,6,7,8,9, - Got
0,1,2,3,6,7,8,9,http://mochi.test:/browser/browser/base/content/test/test_wyciwyg_copying.html,
expected 0,1,2,3,6,7,8,9, (and another)
* [2] Bug 870485 - Intermittent reftest failure on Windows 7 on iX machines
* [3] Bug 870490 - Intermittent mochitest-browser-chrome failure on
Windows 7 on iX machines
* [4] Bug 870488 - Intermittent xpcshell failure on Windows 7 on iX
hardware

I hope to have more machines next week and I will enable the jobs on
Mozilla-Inbound and other branches as well.

Thanks in advance,
Armen

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=820738
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=870485
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=870490
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=870488


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-26 Thread Armen Zambrano G.

Would we be able to go back to the original proposal of disabling 10.7 altogether?
Product (Asa, in a separate thread) and release drivers (Akeybl) were OK with 
the compromise of version-specific test coverage being removed completely.


Side note: adding Mac PGO would increase the build load. (Besides this, we 
have to do a large purchase order, as we expect Mac wait times to show up as 
general load increases.)


Not all load-reducing approaches are easy to implement (due to the way 
that buildbot is designed), and they do not ensure that we would reduce 
the load enough. It's expensive enough to support 3 different versions of Mac 
as is without bringing 10.9 to the table. We have to cut things at times.


One compromise that would be easy to implement and *might* reduce the 
load is to disable all debug jobs for 10.7.


cheers,
Armen

On 2013-04-26 11:29 AM, Justin Lebar wrote:

As a compromise, how hard would it be to run the Mac 10.6 and 10.7
tests on m-i occasionally, like we run the PGO tests?  (Maybe we could
trigger them on the same csets as we run PGO; it seems like that would
be useful.)

On Fri, Apr 26, 2013 at 11:19 AM, Ryan VanderMeulen rya...@gmail.com wrote:

On 4/26/2013 11:11 AM, Justin Lebar wrote:


So what we're saying is that we are going to completely reverse our
previous tree management policy?



Basically, yes.

Although, due to coalescing, do you always have a full run of tests on
the tip of m-i before merging to m-c?



Yes. Note that we generally aren't merging inbound tip to m-c - we're taking
a known-green cset (including PGO tests).

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-26 Thread Armen Zambrano G.
Just disabling debug and talos jobs for 10.7 should reduce more than 50% 
of the load on 10.7. That might be sufficient for now.


Any objections on this plan?
We can re-visit later on if we need more disabled.

cheers,
Armen

On 2013-04-26 11:50 AM, Armen Zambrano G. wrote:

Would we be able to go back to where we disabled 10.7 altogether?
Product (Asa in separate thread) and release drivers (Akeybl) were OK to
the compromise of version specific test coverage being removed completely.

Side note: adding Mac PGO would increase the build load (Besides this we
have to do a large PO as we expect Mac wait times to be showing up as
general load increases).

Not all reducing load approaches are easy to implement (due to the way
that buildbot is designed) and it does not ensure that we would reduce
it enough. It's expensive enough to support 3 different versions of Mac
as is without bringing 10.9 into the table. We have to cut things at times.

One compromise that would be easy to implement and *might* reduce the
load is to disable all debug jobs for 10.7.

cheers,
Armen

On 2013-04-26 11:29 AM, Justin Lebar wrote:

As a compromise, how hard would it be to run the Mac 10.6 and 10.7
tests on m-i occasionally, like we run the PGO tests?  (Maybe we could
trigger them on the same csets as we run PGO; it seems like that would
be useful.)

On Fri, Apr 26, 2013 at 11:19 AM, Ryan VanderMeulen rya...@gmail.com
wrote:

On 4/26/2013 11:11 AM, Justin Lebar wrote:


So what we're saying is that we are going to completely reverse our
previous tree management policy?



Basically, yes.

Although, due to coalescing, do you always have a full run of tests on
the tip of m-i before merging to m-c?



Yes. Note that we generally aren't merging inbound tip to m-c - we're
taking
a known-green cset (including PGO tests).

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform




___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-26 Thread Armen Zambrano G.


On 2013-04-26 12:14 PM, Justin Lebar wrote:

Would we be able to go back to where we disabled 10.7 altogether?


On m-i and try only, or everywhere?


The initial proposal was for disabling everywhere.

We could leave 10.7 opt jobs running everywhere as a compromise and 
re-visit after I re-purpose the first batch of machines.


best regards,
Armen



On Fri, Apr 26, 2013 at 12:10 PM, Armen Zambrano G. arme...@mozilla.com wrote:

Just disabling debug and talos jobs for 10.7 should reduce more than 50% of
the load on 10.7. That might be sufficient for now.

Any objections on this plan?
We can re-visit later on if we need more disabled.

cheers,
Armen


On 2013-04-26 11:50 AM, Armen Zambrano G. wrote:


Would we be able to go back to where we disabled 10.7 altogether?
Product (Asa in separate thread) and release drivers (Akeybl) were OK to
the compromise of version specific test coverage being removed completely.

Side note: adding Mac PGO would increase the build load (Besides this we
have to do a large PO as we expect Mac wait times to be showing up as
general load increases).

Not all reducing load approaches are easy to implement (due to the way
that buildbot is designed) and it does not ensure that we would reduce
it enough. It's expensive enough to support 3 different versions of Mac
as is without bringing 10.9 into the table. We have to cut things at
times.

One compromise that would be easy to implement and *might* reduce the
load is to disable all debug jobs for 10.7.

cheers,
Armen

On 2013-04-26 11:29 AM, Justin Lebar wrote:


As a compromise, how hard would it be to run the Mac 10.6 and 10.7
tests on m-i occasionally, like we run the PGO tests?  (Maybe we could
trigger them on the same csets as we run PGO; it seems like that would
be useful.)

On Fri, Apr 26, 2013 at 11:19 AM, Ryan VanderMeulen rya...@gmail.com
wrote:


On 4/26/2013 11:11 AM, Justin Lebar wrote:



So what we're saying is that we are going to completely reverse our
previous tree management policy?




Basically, yes.

Although, due to coalescing, do you always have a full run of tests on
the tip of m-i before merging to m-c?



Yes. Note that we generally aren't merging inbound tip to m-c - we're
taking
a known-green cset (including PGO tests).

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform





___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-26 Thread Armen Zambrano G.

On 2013-04-26 1:31 PM, Justin Lebar wrote:

I don't think I'm comfortable disabling this platform across the
board, or even disabling debug-only runs across the board.

As jmaher pointed out, there are platform differences here.  If we
disable this platform entirely, we lose visibility into rare but, we
seem to believe, possible events.


That was a python issue that was related to talos.
It was not a Firefox issue that would have only failed on a specific 
version of Mac.



It seems like the only reason to disable everywhere instead of only on
m-i/try (or running less frequently on m-i, like we do with PGO) is
that the former is easier to implement.  It seems like we're proposing
taking a lot of risk here to work around our own failings...

Yes, it is a lot of work to change the way that buildbot works in order to 
optimize a non-standard method of operations.
Just by running jobs on PGO and not on every checkin, we would already make 
the 10.7 platform lesser than the other versions.


I could also have not started this thread about improving our wait times 
for 10.6 at all, and when someone complained one day about wait times on 
rev4 I would have said we cannot buy more machines.


Just a little earlier in the thread you were asking to "go big or go home" 
and to disable even the 10.6 debug tests. I'm confused by the 
different messages.




On Fri, Apr 26, 2013 at 1:03 PM, Armen Zambrano G. arme...@mozilla.com wrote:


On 2013-04-26 12:14 PM, Justin Lebar wrote:


Would we be able to go back to where we disabled 10.7 altogether?



On m-i and try only, or everywhere?



The initial proposal was for disabling everywhere.

We could leave 10.7 opt jobs running everywhere as a compromise and re-visit
after I re-purpose the first batch of machines.

best regards,
Armen




On Fri, Apr 26, 2013 at 12:10 PM, Armen Zambrano G. arme...@mozilla.com
wrote:


Just disabling debug and talos jobs for 10.7 should reduce more than 50%
of
the load on 10.7. That might be sufficient for now.

Any objections on this plan?
We can re-visit later on if we need more disabled.

cheers,
Armen


On 2013-04-26 11:50 AM, Armen Zambrano G. wrote:



Would we be able to go back to where we disabled 10.7 altogether?
Product (Asa in separate thread) and release drivers (Akeybl) were OK to
the compromise of version specific test coverage being removed
completely.

Side note: adding Mac PGO would increase the build load (Besides this we
have to do a large PO as we expect Mac wait times to be showing up as
general load increases).

Not all reducing load approaches are easy to implement (due to the way
that buildbot is designed) and it does not ensure that we would reduce
it enough. It's expensive enough to support 3 different versions of Mac
as is without bringing 10.9 into the table. We have to cut things at
times.

One compromise that would be easy to implement and *might* reduce the
load is to disable all debug jobs for 10.7.

cheers,
Armen

On 2013-04-26 11:29 AM, Justin Lebar wrote:



As a compromise, how hard would it be to run the Mac 10.6 and 10.7
tests on m-i occasionally, like we run the PGO tests?  (Maybe we could
trigger them on the same csets as we run PGO; it seems like that would
be useful.)

On Fri, Apr 26, 2013 at 11:19 AM, Ryan VanderMeulen rya...@gmail.com
wrote:



On 4/26/2013 11:11 AM, Justin Lebar wrote:




So what we're saying is that we are going to completely reverse our
previous tree management policy?





Basically, yes.

Although, due to coalescing, do you always have a full run of tests
on
the tip of m-i before merging to m-c?



Yes. Note that we generally aren't merging inbound tip to m-c - we're
taking
a known-green cset (including PGO tests).

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform






___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-26 Thread Armen Zambrano G.

After re-reading, I'm happy to disable just m-i/try for now.

Modifying things to trigger *some* jobs on m-i, though, would be a decent 
amount of work (adding Mac PGO builders), would still be different from normal 
operations, and would increase the 10.6/10.8 test load.


On 2013-04-26 1:31 PM, Justin Lebar wrote:

I don't think I'm comfortable disabling this platform across the
board, or even disabling debug-only runs across the board.

As jmaher pointed out, there are platform differences here.  If we
disable this platform entirely, we lose visibility into rare but, we
seem to believe, possible events.

It seems like the only reason to disable everywhere instead of only on
m-i/try (or running less frequently on m-i, like we do with PGO) is
that the former is easier to implement.  It seems like we're proposing
taking a lot of risk here to work around our own failings...

On Fri, Apr 26, 2013 at 1:03 PM, Armen Zambrano G. arme...@mozilla.com wrote:


On 2013-04-26 12:14 PM, Justin Lebar wrote:


Would we be able to go back to where we disabled 10.7 altogether?



On m-i and try only, or everywhere?



The initial proposal was for disabling everywhere.

We could leave 10.7 opt jobs running everywhere as a compromise and re-visit
after I re-purpose the first batch of machines.

best regards,
Armen




On Fri, Apr 26, 2013 at 12:10 PM, Armen Zambrano G. arme...@mozilla.com
wrote:


Just disabling debug and talos jobs for 10.7 should reduce more than 50%
of
the load on 10.7. That might be sufficient for now.

Any objections on this plan?
We can re-visit later on if we need more disabled.

cheers,
Armen


On 2013-04-26 11:50 AM, Armen Zambrano G. wrote:



Would we be able to go back to where we disabled 10.7 altogether?
Product (Asa in separate thread) and release drivers (Akeybl) were OK to
the compromise of version specific test coverage being removed
completely.

Side note: adding Mac PGO would increase the build load (Besides this we
have to do a large PO as we expect Mac wait times to be showing up as
general load increases).

Not all reducing load approaches are easy to implement (due to the way
that buildbot is designed) and it does not ensure that we would reduce
it enough. It's expensive enough to support 3 different versions of Mac
as is without bringing 10.9 into the table. We have to cut things at
times.

One compromise that would be easy to implement and *might* reduce the
load is to disable all debug jobs for 10.7.

cheers,
Armen

On 2013-04-26 11:29 AM, Justin Lebar wrote:



As a compromise, how hard would it be to run the Mac 10.6 and 10.7
tests on m-i occasionally, like we run the PGO tests?  (Maybe we could
trigger them on the same csets as we run PGO; it seems like that would
be useful.)

On Fri, Apr 26, 2013 at 11:19 AM, Ryan VanderMeulen rya...@gmail.com
wrote:



On 4/26/2013 11:11 AM, Justin Lebar wrote:




So what we're saying is that we are going to completely reverse our
previous tree management policy?





Basically, yes.

Although, due to coalescing, do you always have a full run of tests
on
the tip of m-i before merging to m-c?



Yes. Note that we generally aren't merging inbound tip to m-c - we're
taking
a known-green cset (including PGO tests).

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform






___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Improving Mac OS X 10.6 test wait times by reducing 10.7 load

2013-04-25 Thread Armen Zambrano G.

(please follow up through mozilla.dev.planning)

Hello all,
I have recently been looking into our Mac OS X test wait times, which 
have been bad for many months and progressively getting worse.

Less than 80% of test jobs on OS X 10.6 and 10.7 are able to start
within 15 minutes of being requested.
This slows down getting test results for OS X and makes tree closures 
longer if we have a Mac OS X test backlog.
Unfortunately, we can't buy any more revision 4 Mac minis (they're not 
sold anymore), as Apple discontinues old hardware as new hardware comes out.


In order to improve the turnaround time for Mac testing, we have to look 
into reducing our test load in one of these two OSes (both of them run 
on revision 4 minis).
We have over a third of our OS X users running 10.6. Eventually, down 
the road, we could drop 10.6, but we still have a significant number of 
our users there, even though Apple stopped serving them major updates 
in July 2011 [1].


Our current Mac OS X distribution looks like this:
* 10.6 - 43%
* 10.7 - 30%
* 10.8 - 27%
OS X 10.8 is the only version that is growing.

In order to improve our wait times, I propose that we stop testing on 
tbpl per-checkin [2] on OS X 10.7 and re-purpose the 10.7 machines as 
10.6 to increase our capacity.


Please let us know if this plan is unacceptable and needs further 
discussion.


best regards,
Armen Zambrano - Mozilla's Release Engineering
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Adjusting Windows 64-bit per-checkin builds (not related to nightly builds)

2013-04-02 Thread Armen Zambrano G.

Since last Friday, we're only running win64 builds per check-in on [1]:
* mozilla-central
* try [2]

We generate nightly win64 updates on mozilla-central.

All win64 jobs are running hidden:
https://tbpl.mozilla.org/?jobname=WINNT%206.1%20x86-64&showall=1

Our win32 builds are no longer seeing delays before being processed.

best regards,
Armen

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=814009
[2] trychooser has been updated and this is the syntax:
try: -b o -p win64 -u none -t none

On 2013-03-27 1:19 PM, Armen Zambrano G. wrote:

Hi,
We're currently suffering a lack of capacity on the win64 builders.
I noticed that we still run win64 dependent builds for Thunderbird &
Firefox. I would like to disable those since they cost approximately 1/3
of our load (win32 opt/debug & win64 opt).


...

best regards,
Armen



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Adjusting Windows 64-bit per-checkin builds (not related to nightly builds)

2013-04-02 Thread Armen Zambrano G.

On 2013-03-27 1:29 PM, Benjamin Smedberg wrote:

On 3/27/2013 1:19 PM, Armen Zambrano G. wrote:


On another note, there could be a tree booked for win64 and move
nightly win64 users there (orthogonal to updating users to 32-bit
builds) since it would allow the community control which merges from
mozilla-central to take in (and back out from bad merges).
We could have dep builds running on it as well as on mozilla-central
[1][2].

Please let me know what you think of this second part of the post.

I don't think that an extra tree would be necessary or useful. So far, nobody
has volunteered to maintain the win64 port, so having an extra tree only
means that nobody would do merges and we wouldn't even have nightly
regression ranges when things broke.

--BDS

I believe alex_mayorga volunteered on the "update on turning off 64-bit" 
thread. Makoto Kato might be interested as well.


We have try support if anyone wants to fix issues.

best regards,
Armen


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Adjusting Windows 64-bit per-checkin builds (not related to nightly builds)

2013-03-27 Thread Armen Zambrano G.

Hi,
We're currently suffering a lack of capacity on the win64 builders.
I noticed that we still run win64 dependent builds for Thunderbird & 
Firefox. I would like to disable those since they cost approximately 1/3 
of our load (win32 opt/debug & win64 opt).


If anyone has a strong objection to this first part of the plan, please 
let me know.




I am not touching nightly updates since, from the thread, it seems to be an 
open-ended issue (automatically updating people to the 32-bit builds), 
besides not being my forte.


On another note, there could be a tree booked for win64, and we could move 
nightly win64 users there (orthogonal to updating users to 32-bit builds), 
since it would allow the community to control which merges from 
mozilla-central to take in (and to back out bad merges).

We could have dep builds running on it as well as on mozilla-central [1][2].

Please let me know what you think of this second part of the post.

best regards,
Armen


[1]
In reply to John O'Duinn [:joduinn] from comment #37)
 tweaking summary, per bsmedberg, we'll still generate 64bit windows 
builds

 as follows:

 1) generate 64bit builds on mozilla-central, but not on mozilla-inbound,
 m-a/m-b/m-r or any project branches
 ** 64bit builds would be nightly builds only, not per checkin builds

[2]
On 2012-12-21 4:43 PM, Benjamin Smedberg wrote:

After I announced my decision to disable 64-bit Windows nightlies, there
was significant negative feedback. After reviewing that feedback, and
consulting with Release Engineering, I believe that we can keep a set of
users happy by making a modification to the original plan.

Most importantly, it seems that there are users who regularly run into
the 4GB memory limits of 32-bit builds. These users often have hundreds
or even thousands of tabs. These users are using the 64-bit nightlies
not primarily to be part of our testing community, but because those
builds are the best product available.

At this point, the Mozilla project does not have the resources to
actively support this use case. Making these builds, however, is not a
significant burden on our Release Engineering group. Therefore I have
modified my original plan in the following way:

* Migrate all existing users of win64 nightly channel builds to the
win32 nightly channel builds via automatic update.
* Continue to build win64 Nightly builds and updates on the nightly
channel. Users who need the 64-bit builds will have to download it after
the migration point (date TBD).
** Change the default first-run and update page for win64 builds to
explain to users that they are not supported.
** Disable the crash reporter for win64 builds
** Enable click-to-play plugins by default in the win64 builds.
* Discontinue the win64 tests and on-checkin builds to reduce release
engineering load. By default, do not generate win64 builds on try.
* win64 builds will be considered a “tier 3” build configuration. [1]

We will continue to test the win32 builds and make sure that they work
well on both 32-bit and 64-bit versions of Windows. Specifically, all of
our testing on Windows 8 is planned to be done on the 64-bit version of
Windows 8.

I do hope that the projects and developers who are interested in win64
will work together to maintain this build configuration. I am interested
in hearing from volunteers who want to become the 64-bit build
maintainer. I will also set up a discussion list specifically for win64
issues, if that would be valuable.

--BDS

Please post followups to to mozilla.dev.apps.firefox

[1] https://developer.mozilla.org/en-US/docs/Supported_build_configurations





___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Running Windows 8 tests visibly on mozilla-central, mozilla-inbound and try

2013-03-19 Thread Armen Zambrano G.

Hi,
Besides three test suites (reftests, debug m-2 and debug m-o), all other 
jobs are running consistently green.


If you want to read more about it visit 
http://armenzg.blogspot.ca/2013/03/running-windows-8-tests-visibly-on-tbpl.html


best regards,
Armen
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Deployed _dumbwin32proc.py to all win7 and winxp machines

2013-03-15 Thread Armen Zambrano G.

Hi,
This morning I deployed a different version of _dumbwin32proc.py to all 
of our Win7 (talos-r3-w7-*) and WinXP machines (talos-r3-xp-*).


This version allows the machines to kill processes that previously could 
not be killed. For instance, when we asked for a job to be interrupted, it 
would simply not happen.


I have been watching the tree to see if I can catch any regression for 
those 2 pools of machines. If you see anything out of the ordinary 
please let me know in bug 798170 [1].


So far, this change (if it works as expected) should improve our wait 
times. For instance, jobs that failed to kill a process and instead timed 
out, or jobs that got cancelled but still ran all the way through. Let's 
hope that this works as expected and that we see better wait times for 
those two pools.


Next week I will do the same for the Win64 builders and the Win8 machines.

Best regards,
Armen [2]


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=798170
[2] https://mozillians.org/en-US/u/armenzg
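
The _dumbwin32proc.py patch itself is buildbot/twisted glue, so it is not
reproduced here; purely as an illustration of the kind of forceful kill the
new version performs on Windows, a minimal sketch (the PID below is a
placeholder) could look like this:

# Illustration only -- not the actual _dumbwin32proc.py change.
# Forcefully kill a process *and its children* on Windows via taskkill.
import subprocess

def force_kill_tree(pid):
    # /F = force, /T = also terminate child processes, /PID = target process id
    return subprocess.call(["taskkill", "/F", "/T", "/PID", str(pid)])

if __name__ == "__main__":
    example_pid = 1234  # placeholder PID of a hung job's process
    status = force_kill_tree(example_pid)
    print("taskkill exited with status %d" % status)
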
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Windows 8 and our continuous integration system (tbpl)

2012-11-06 Thread Armen Zambrano G.
Hi all,
IT and RelEng are working on creating a new test infrastructure to make
our current Rev3 machines obsolete. To better meet our needs, we want
to make sure that we test Firefox 32-bit on the right set of platforms.
Mozilla currently has continuous integration running (on tbpl.mozilla.org)
for unit tests and talos for the 32-bit versions of Windows XP and Windows 7.
We currently do not run tests on the 64-bit version of these platforms [1].

We now are planning on adding the 64-bit version of Windows 8 to our continuous
integration matrix. The proposal is to run Firefox 32-bit on Windows XP 32-bit,
Windows 7 32-bit and Windows 8 64-bit.

We would like to  know if there are any reasons to also run tests on
Windows 8 *32-bit*. This does not mean that Mozilla would not qualify that
Firefox *32-bit* works properly on Windows 8 *32-bit* through various manual
QA and testing methods; it means we would not run automated tests per developer 
checkin on this platform. The main reason for skipping Windows 8 32-bit is to
avoid the costs of maintaining another two hundred machines for our continuous
integration for this platform.

In our testing and due diligence phase, we believe that we will not lack any
coverage by only testing on *64-bit* Windows 8. If this is an incorrect
assessment please let us know. We're aware that there might be edge cases where 
some tests would not catch things for Windows 8 *32-bit*. This is a risk that
we are willing to take the same way that we have been comfortable with only
running tests on Win7 *32-bit* and not on Win7 *64-bit* version (not that past
decisions should blind us).

I hope this makes sense; please let us know if there are any gotchas or if you
have any questions.

Best regards,
Armen Zambrano
Mozilla's Release Engineering


PS  = Creating a new test infrastructure for Linux is also coming but it is off
  topic for this  thread.
PPS = This post is not about changing our system requirements [2].


[1] We also have five Windows 7 64-bit machines which were left around just in
case Firefox *64-bit* became important again and we would not need to start
from  scratch.
[2] http://www.mozilla.org/en-US/firefox/system-requirements.html
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Windows 8 and our continuous integration system (tbpl)

2012-11-06 Thread Armen Zambrano G.
This was discussed at the Platform meeting without any objections:
https://wiki.mozilla.org/Platform/2012-11-06#Roundtable

Should I make more noise about this so it does not get missed?
Should I blog? I sometimes fear how the press can misinterpret our posts.

FTR, asa, ehsan and jimm have in direct threads said that this plan is good.

On Tuesday, November 6, 2012 9:00:29 AM UTC-5, Armen Zambrano G. wrote:
 Hi all,
 IT and RelEng are working on creating a new test infrastructure to make
 obsolete our current Rev3 machines. To better meet our needs, we want
 to make sure that we test Firefox 32-bit on the right set of platforms.
 Mozilla currently has continuous integration running (on tbpl.mozilla.org)
 for unit tests and talos for the 32-bit versions of Windows XP and Windows 7.
 We currently do not run tests on the 64-bit version of these platforms [1].
 
 We now are planning on adding the 64-bit version of Windows 8 to our
 continuous integration matrix. The proposal is to run Firefox 32-bit on
 Windows XP 32-bit, Windows 7 32-bit and Windows 8 64-bit.
 
 We would like to know if there are any reasons to also run tests on
 Windows 8 *32-bit*. This does not mean that Mozilla would not qualify that
 Firefox *32-bit* works properly on Windows 8 *32-bit* through various manual
 QA and testing methods; it means we would not run automated tests per
 developer checkin on this platform. The main reason for skipping Windows 8
 32-bit is to avoid the costs of maintaining another two hundred machines
 for our continuous integration for this platform.
 
 In our testing and due diligence phase, we believe that we will not lack any
 coverage by only testing on *64-bit* Windows 8. If this is an incorrect
 assessment please let us know. We're aware that there might be edge cases
 where some tests would not catch things for Windows 8 *32-bit*. This is a
 risk that we are willing to take the same way that we have been comfortable
 with only running tests on Win7 *32-bit* and not on Win7 *64-bit* version
 (not that past decisions should blind us).
 
 I hope this makes sense and please let us know if there are any gotchas or
 you have any questions.
 
 Best regards,
 Armen Zambrano
 Mozilla's Release Engineering
 
 PS  = Creating a new test infrastructure for Linux is also coming but it is
 off topic for this thread.
 PPS = This post is not about changing our system requirements [2].
 
 [1] We also have five Windows 7 64-bit machines which were left around just
 in case Firefox *64-bit* became important again and we would not need to
 start from scratch.
 [2] http://www.mozilla.org/en-US/firefox/system-requirements.html

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Armen Zambrano G.

Is there a place where this will be documented?
I would like to keep an eye on what gets added or point people at it.

This is awesome!
Looking forward to see how it goes.

thanks Ehsan

On 2012-10-18 2:05 PM, Ehsan Akhgari wrote:

Hi everyone,

As part of our efforts to get more value out of the Talos test suite for
preventing performance regressions, we believe that we are now ready to put in
place a first set of measures against startup time regressions.  We will start by
imposing a new backout policy for mozilla-inbound checkins for regressions
more than 4% on any given platform.  If your patch falls in a range which
causes more than 4% Ts regression, it will be backed out by our sheriffs
together with the rest of the patches in that range, and you can only
reland after you fix the regression by testing locally or on the try server.

The 4% threshold has been chosen based on anecdotal evidence on the most
recent Ts regressions that we have seen, and is too generous, but we will
be working to make the reporting and regression detection systems
better, and as those get improved, we will feel more comfortable with
imposing this policy on other Talos tests with tighter thresholds.

Please let me know if you have any questions.

Cheers,
--
Ehsan
http://ehsanakhgari.org/
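
For reference, the threshold above is a relative comparison; a tiny worked
example of the arithmetic (only an illustration, not the actual Talos
regression-detection code, and the numbers are made up) is:

# Illustration of the 4% Ts backout threshold arithmetic (not the real tooling).
BACKOUT_THRESHOLD = 0.04  # 4%

def regression_fraction(baseline_ms, new_ms):
    # Relative change in startup time; positive means slower (a regression).
    return (new_ms - baseline_ms) / float(baseline_ms)

# Hypothetical Ts numbers for one platform, in milliseconds.
baseline, after_patch = 520.0, 545.0
change = regression_fraction(baseline, after_patch)
print("Ts changed by %.1f%%" % (change * 100))
if change > BACKOUT_THRESHOLD:
    print("over the 4% threshold on this platform -> eligible for backout")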



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

