On Oct 9, 2017, at 11:49, Zero King wrote:

> Did you trigger a force build [1] on the wrong builder? I started
> another one [2] on 10.13.
> 
> [1] https://build.macports.org/builders/ports-10.12_x86_64-watcher/builds/8476
> [2] https://build.macports.org/builders/ports-10.13_x86_64-watcher/builds/427

It was intentional, but didn't have the effect I expected.

The High Sierra builder has been busy building ports as commits come in. It 
builds and deploys binaries not only for the port being modified, but also for 
all of its dependencies recursively. This is a good way to get started, so 
that we provide binaries for things that are actively being updated.

During lulls, I've been scheduling small batches of additional ports to be 
built. For example, I built all the gcc and clang compilers because they are 
large and time-consuming to compile; I did all the postgresql and mysql 
databases; I did xorg since that would recursively build all xorg ports; and I 
did all the third-party php modules since there are just a few of them and 
they're mostly standalone. Then I moved on to batches of perl modules, python 
modules and libraries beginning with a, then b, then c, etc. Each small batch 
takes some hours to do, but afterward builds from commits can resume. In this 
way, builds from commits are somewhat delayed, but not too much. So that's one 
reason why I try to schedule small batches of ports rather than large ones, but 
there's another reason too.

The buildbot (specifically, portwatcher) is smart enough that, when it is asked 
to build a batch of ports, it only schedules (portbuilder) builds for those 
ports and subports which haven't already been built. Then it works through that 
set of scheduled builds before moving on to the next batch. The check for 
whether a port is already built happens only once per batch (in portwatcher), 
before any port from that batch is built, not at any other time during that 
batch (i.e. not in portbuilder).
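
To make that concrete, here is a minimal sketch of the once-per-batch check in 
Python. It is not the actual portwatcher code; the function name, the subports 
mapping and the already_built set are hypothetical stand-ins for illustration.

```python
def schedule_batch(requested, subports, already_built):
    """Return the ports to schedule for one batch.

    The already-built check happens only here, once, before any build in the
    batch runs. Nothing re-checks it while the batch is being worked through.
    """
    candidates = []
    for port in requested:
        candidates.append(port)
        # Subports of a requested port are considered as well.
        candidates.extend(subports.get(port, []))
    return [p for p in candidates if p not in already_built]


# Hypothetical data: zlib has one subport, nothing has been built yet.
print(schedule_batch(["zlib", "xz"], {"zlib": ["minizip"]}, set()))
```

On a fresh builder every candidate gets scheduled; a port drops out only if it 
was already built before the batch was requested.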

For this reason, it is more efficient to build small batches of ports at a 
time. Imagine a new builder which has not yet built any ports, and you want it 
to build zlib and xz. zlib has an extract dependency on xz, and xz has a few 
library dependencies. zlib also has a subport, minizip, which has dependencies 
on several ports including zlib.

Suppose you schedule a single batch and list both zlib and xz in the portlist 
field. This can be inefficient. First, buildbot will check whether zlib, 
minizip, or xz has already been built. They haven't, so all three are scheduled 
to be built. Currently, builds are inadvertently scheduled in somewhat random 
order [1], so it could be that the minizip build occurs first. It will build 
its dependencies first, which include zlib and xz, then build minizip, and then 
all installed active ports will be deactivated. Finally, all the binaries will 
be distributed. Maybe the zlib build got scheduled second, and it installs 
(well, reactivates) its dependencies. It then wants to build zlib, except it's 
already been built, so it just reactivates the already-built zlib. It then 
deactivates all installed ports. All the binaries have already been 
distributed, so nothing needs to be done there either. This build was a waste 
of time, since all the work this build would have done was already done by the 
previous build. Finally we get to xz which happened to be scheduled third. Much 
the same thing happens: dependencies reactivate, xz reactivates, everything 
deactivates, nothing needs to be distributed. Waste of time.

Now suppose that instead, you schedule two batches of ports: First zlib, then 
xz. The first batch proceeds similarly to the above, with two builds being 
scheduled, one for zlib and one for minizip. Sure, there's still some wasted 
time there, if minizip goes first. But when it comes time for the second batch, 
portwatcher realizes xz has already been built, and does not schedule a build 
for it. Time saved.
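
The two scenarios above can be compared with a toy simulation. This is only a 
sketch under simplifying assumptions, not how buildbot or mpbb is implemented: 
the dependency table is reduced to the zlib/minizip/xz example (xz's own 
library dependencies are left out), and all names are hypothetical.

```python
# Hypothetical dependency data for the example above.
DEPS = {"zlib": ["xz"], "minizip": ["zlib"], "xz": []}
SUBPORTS = {"zlib": ["minizip"]}


def build(port, built):
    """One scheduled build: build missing dependencies recursively, then the port.

    Returns True if anything new was built, False if the build was redundant.
    """
    did_work = False
    for dep in DEPS[port]:
        did_work |= build(dep, built)
    if port not in built:
        built.add(port)
        did_work = True
    return did_work


def run_batches(batches, order):
    """Return (scheduled builds, wasted builds) for a list of batches."""
    built = set()
    scheduled = wasted = 0
    for batch in batches:
        candidates = []
        for port in batch:
            candidates += [port] + SUBPORTS.get(port, [])
        # Checked once per batch, before any build in the batch runs.
        queue = [p for p in candidates if p not in built]
        for port in order(queue):
            scheduled += 1
            if not build(port, built):
                wasted += 1
    return scheduled, wasted


def minizip_first(ports):
    """Unlucky ordering: the minizip build happens to run first."""
    return sorted(ports, key=lambda p: p != "minizip")


print(run_batches([["zlib", "xz"]], minizip_first))    # one batch
print(run_batches([["zlib"], ["xz"]], minizip_first))  # two batches
```

With the unlucky ordering, the single batch schedules three builds, two of 
them redundant, while the two-batch version schedules two builds, one of them 
redundant, and never schedules xz at all.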

You can imagine that this gets worse the more ports you schedule at once, and 
the more interdependencies they have. And we have lots of python and perl 
modules, and they tend to have lots of interdependencies. So your forced build 
of all python modules on the High Sierra builder scheduled 2400 builds, which 
have so far taken 68 hours, during which no builds from commits could take 
place. We can let it run, because it's almost done now, and we do need to 
attempt those builds sooner or later, but I suspect many of those builds could 
have been avoided if they had been scheduled in smaller batches.

So why then did I schedule a build of all python modules on the Sierra builder? 
We have never scheduled a build of all ports on the Sierra builder, because 
when reimplementing the buildbot after moving away from Mac OS Forge we never 
added the capability to just type the word "all" into the portlist field, and 
because I suspect attempting to build all ports would take a much longer time 
than on the old system, on which it already took weeks. We have a ticket [2] 
about adding that capability, and I wanted to see what would happen if I 
requested a build of a larger-than-usual number of ports, though still fewer than all 
ports. I expected that since the Sierra builder has existed for about a year 
already, most python modules would already have been built due to updates or as 
dependencies, so I expected maybe a few hundred builds to get scheduled.

What I found was that our implementation of checking whether ports have already 
been built is inefficient and slow [3]. And I was surprised that 2500 builds 
got scheduled, which have been building for the past 100 hours. Maybe tons of 
our python modules are unused and don't get updated. Probably part of the 
problem was that I had redelivered hook messages that GitHub said failed to 
deliver [4] (though they probably did deliver) and as a result, some ports that 
had been built got completely uninstalled when mpbb mistakenly believed they 
were too old. (They were, in fact, "too new" -- newer than the old commit whose 
hook message was redelivered.)

Some Sierra binaries are being produced, but not as many as you might hope with 
2500 scheduled builds. Maybe most of the ports being built here aren't 
distributable.

I have some ideas for changes to buildbot/mpbb which might help. I'll bring 
those up later.


[1] https://trac.macports.org/ticket/52766

[2] https://trac.macports.org/ticket/52989

[3] https://trac.macports.org/ticket/55041

[4] https://trac.macports.org/ticket/54648
