Hi,

I've watched you guys thinking for an hour ;-)

Some comments from me.

Yes to moving build flows that generate assets into the tree.
Yes to having a way for developers to reproduce what automation does.
Yes to having jobs executed on demand rather than on push, and to having them produce idempotent results.

Sceptical of the vision that we'll see the end of inbounds. The interactions between test results and rebases don't seem trivial enough to me to hope for always-open trees without backouts via autoland.

I'm having an 'oh noes' about the single command called by automation. My main point here is the usefulness of the logs generated. When you put all sequential and parallel tasks into one single wrapper process, you end up with one big log file on ftp, like today. And if anything happens, one needs to read that log and reverse engineer which characters in it are stdout or stderr, and which task they belong to. I know I can't tell good from bad in our logs.

OTOH, you could have all the structure of the process exposed in the automation and its reporting. If something goes wrong, you can tell the location of the problem in the process right away, and you can drill down to the failing task and its dependencies.
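
As a rough illustration of the per-task reporting described above, here is a minimal sketch in which every task gets its own stdout and stderr files instead of sharing one big log. The function name `run_task`, the log file naming, and the returned dictionary are assumptions made for this example, not anything that exists in automation today.

```python
import subprocess
import sys
from pathlib import Path


def run_task(name, argv, log_dir):
    """Run one task and keep its stdout and stderr in separate per-task files.

    `name`, `argv`, and the log layout are hypothetical, for illustration only.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    out_path = log_dir / f"{name}.stdout.log"
    err_path = log_dir / f"{name}.stderr.log"
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        result = subprocess.run(argv, stdout=out, stderr=err)
    # The caller can report status per task and link to exactly these two files.
    return {
        "task": name,
        "returncode": result.returncode,
        "stdout": out_path,
        "stderr": err_path,
    }


if __name__ == "__main__":
    # Stand-in command; a real task would invoke the build step instead.
    report = run_task("compile", [sys.executable, "-c", "print('compiling')"], "task-logs")
    print(report["task"], report["returncode"])
```

With a layout like this, a failure report can point at one task and one stream, rather than at an offset somewhere in a combined log.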

If I think of the problem, I'm thinking along these lines: Let's specify the process as a DAG of serialized and parallelized tasks, inside the tree, and have the automation run that as is (*). Offer developers a console-only hook to that fragment of the complete automation process, akin to integration tests.

* While using buildbot, parallel tasks would need to be executed sequentially. I read the recent posts by Taras et al. as saying that buildbot isn't a solid requirement going forward.
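
The DAG idea above can be sketched roughly as follows, using `graphlib.TopologicalSorter` from the Python standard library (3.9+). The task names and the shape of the `TASKS` mapping are assumptions for illustration; "upload" in particular is a made-up step. `stages` groups tasks that could run in parallel, and `sequential_order` flattens them for a runner that can only execute one task at a time, as in the buildbot footnote.

```python
from graphlib import TopologicalSorter

# Hypothetical process description: each task maps to the tasks it depends on.
TASKS = {
    "compile": set(),
    "symbols": {"compile"},
    "package": {"compile"},
    "upload": {"symbols", "package"},
}


def stages(graph):
    """Return a list of stages; tasks within one stage do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the order stable
        result.append(ready)
        sorter.done(*ready)
    return result


def sequential_order(graph):
    """Flatten the stages for a runner that executes one task at a time."""
    return [task for stage in stages(graph) for task in stage]


if __name__ == "__main__":
    print(stages(TASKS))
    print(sequential_order(TASKS))
```

Because the description is plain data, the same mapping could drive both the automation and a developer's console-only run of a fragment of the process.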

A few comments on mozharness. One of the earliest tasks it offered, IIRC, was multi-locale android builds. Sadly, it turns out that it's not helping those developers who want to create and test multi-locale builds. Its monolithic deliverable isn't what developers need at the point when they test multi-locale builds, nor does it blend into the developer's setup. Folks like rnewman were glad once I explained how to avoid using mozharness for their builds. To me that's a sign of an inadequate level of abstraction.

And, as it's been mentioned all over the call, l10n repacks:

Testing: repacks are hard to test, and they should be. They're designed to be infallible, so that, no matter what happens in a localization, they produce runnable builds. A test has a hard time telling a broken localization from a broken build system. We shouldn't overestimate how many errors in the build end up as build bustage, or how many of those would actually show up as test failures. And some errors don't generate build failures at all, but are bustages nonetheless. One example would be broken locale merge dirs. Anything can be in those, and the builds build and run fine. They're just not showing the right strings.

More generally, repacks are basically unowned at this point. There's a bit of ownership in build, in releng, and with me, as to how they're done. There's absolutely nothing as far as reporting goes. The agreement between John and me was "if there's anything odd, file a bug on releng to dig in".

That's as much as I can get out of my brain into writing, I wish I had an hour-long video to go back and forth about stuff ;-)

Axel

On 2/28/14, 9:48 PM, Gregory Szorc wrote:
(This is likely off-topic for many dev-platform readers. I was advised
to post here because RelEng monitors dev-platform and I don't like
cross-posting.)

The technical interaction between build automation and mozilla-central
has organically grown into something that's very difficult to maintain
and improve. There is no formal API between automation and
mozilla-central. As a result, we have automation calling into esoteric
or unsupported commands and make targets. Change is difficult because it
must be coordinated with automation changes. Build system maintainers
lack understanding of what is and isn't used in automation. It's
difficult to reproduce what automation does locally.

The current approach slows everyone down, leads to too-frequent breakage
(l10n repacks are a great example), and limits the efficiency of
automation.

I'm elated to state that at a meeting earlier today, we worked out a
solution to these problems! Full details are in bug 978211.

tl;dr we are going to marshal all interaction between automation and the
tree through a mach-like in-tree script. This script will establish a
clear, supported, and auditable API for automation tasks and will
establish a level of indirection allowing the tree to change without
requiring corresponding changes to automation.

Some of the benefits of this approach include:

* Abstracting the build backend away from automation. The tree will
choose GNU make, pymake, mozmake, Tup, etc depending on what it knows is
best. Currently, automation has {make, pymake, mozmake} hard-coded.

* Allowing the build system to execute more efficiently. Currently,
automation executes compile, symbol generation, and packaging as
separate steps. This change opens the door to moving all these steps
into the core build system's DAG so they can run concurrently, allowing
your build jobs to complete sooner.

* Better support for l10n repacks. They have been a constant headache
for everyone who's touched them. This opens the door to moving more of
the logic into the tree, in a more well-defined API.

* A lot of us want to kill client.mk. Having automation not directly
calling it will allow us to finally do this.

* Clearer identification of problems and responsibilities. Currently,
when something like l10n repacks break, it's not clear if it was due to
a change in the tree or in automation or even what change caused the
regression! Sadly, lots of fingerpointing and "not my problem" tends to
ensue. This change will establish clearer borders and thus lead to
easier and better resolutions.

Please follow bug 978211 for updates.

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
