Re: [tools-dev] Re: Building OpenOffice.org with GNU make

Mathias Bauer Thu, 21 Jan 2010 00:51:29 -0800

Jussi Pakkanen wrote:

> On Wed, Jan 13, 2010 at 5:21 PM, bjoern michaelsen - Sun Microsystems
> - Hamburg Germany <bjoern.michael...@sun.com> wrote:
> 
>> Mu. As long as you have a recursive build process, you have a lot of
>> implicit dependencies and those are hurting parallelization. On top of
>> that, process instantiation and file-I/O is very, very expensive on
>> Windows, the slowest and most problematic platform of all (both are used
>> extensively when doing recursion). This might not matter too much for a
>> full build that package maintainers usually do. But it matters very
>> much for the change->rebuild->change-cycle that devs usually do (Both
>> implicit deps and recursion over mostly noop build tasks really hurt
>> here).
> 
> There has been much speculation about performance but very few
> hard facts.
It's not only about performance. I still wonder whether and how CMake
(or better: the CMake+GNU Make+MSBuild combo) is able to support the
"all in one process" approach. So far I can only see that with CMake we
can either copy the current iterative approach (build.pl) or use a
recursive one.


So, how can we implement "include, not execute" with CMake?

But IMHO we are talking too much about performance. Let me recall the
even more important requirement that caused me to think about changes in
the first place: consistency and reliability.

Building a single target currently requires maintaining several files,
at least a build.lst, a d.lst and several makefile.mk files (others like
makefile.pmk etc. put aside for the moment). It's easy to forget
something, but even if you don't: it's a PITA how many files you have to
edit if you just move some files around. I'm doing some refactoring ATM
and it really sucks.

So it's no surprise that the current system is very susceptible to
errors. Let me explain this with a simple example.

Consider that you have the modules A, B, C and D. D depends on B and C,
which both depend on A. This is reflected by corresponding entries in
the build.lst files of these modules. Let's look at what happens if I
forget to add the dependency on A to the build.lst of C.

When I do a complete build from scratch for module D, it will work if B
is built before C: building B also builds A, so C's precondition is met
by accident and C builds fine as well. If C is built before B, the
build will break because A is missing.
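
To make the masking concrete, here is a small Python sketch. It is not
build.pl, only a toy model: the names BUILD_LST, REAL_NEEDS, build and
try_build are made up for illustration. Each build.lst is modelled as a
list of declared prerequisites, with the entry for A missing from C, and
D is built twice against an initially empty solver, once with B first
and once with C first.

```python
# Hypothetical model of per-module build.lst files; C forgets to list A.
BUILD_LST = {
    "A": [],
    "B": ["A"],
    "C": [],          # bug: should be ["A"]
    "D": ["B", "C"],
}

# What each module really needs in the solver at build time (assumed).
REAL_NEEDS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


def build(module, solver, order):
    """Build the declared prerequisites first (in the given order),
    then the module itself. Only declared prerequisites get built."""
    if module in solver:
        return
    for dep in sorted(BUILD_LST[module], key=order.index):
        build(dep, solver, order)
    missing = [d for d in REAL_NEEDS[module] if d not in solver]
    if missing:
        raise RuntimeError(f"{module} breaks: {missing} not in solver")
    solver.add(module)


def try_build(order):
    """Build D from an empty solver and report success or the failure."""
    solver = set()
    try:
        build("D", solver, order)
        return "ok"
    except RuntimeError as err:
        return str(err)


result_b_first = try_build(["A", "B", "C", "D"])  # B is built before C
result_c_first = try_build(["A", "C", "B", "D"])  # C is built before B
print(result_b_first)
print(result_c_first)
```

With B first the sketch reports success although C's build.lst is wrong;
with C first the same input fails. The only thing that changed is the
order in which the prerequisites of D were visited.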

Now consider the ~200 modules we have and the multitude of dependencies
between them. It's easy to see that when a particular dependency is
missing, whether your build breaks or runs, and whether the result is
correct, depends on the number of modules that are built, the number of
processes used (-Pn) and maybe other circumstances. So a build might
work well for those who made some changes, but can break for others
later on after the changes have been integrated.

How can one detect such a missing dependency? The only way I see is to
call "build --all" with an empty solver for each and every module to get
a consistent list in the first place, and in future to repeat this for
every module that got its dependencies changed. This is not sufficient to
support all flavors of build.pl, but at least it is sufficient for
"build" and "build --all". Is it realistic to expect that everybody will
do that? Wouldn't it be better if the error detection is "built in"?
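
As a rough sketch of that per-module check (again a toy model in Python,
with made-up names DECLARED, USED, declared_closure and find_missing):
for every module, collect what a from-scratch build would put into the
solver by following only the declared prerequisites, and compare that
with what the module really uses. In a real build the "really uses"
side is not written down anywhere; it only shows up as a broken build,
which is why each module has to be built on its own.

```python
# Hypothetical data: declared prerequisites (build.lst) vs. actual usage.
DECLARED = {"A": [], "B": ["A"], "C": [], "D": ["B", "C"]}
USED = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


def declared_closure(module):
    """Everything a from-scratch build of `module` would deliver to the
    solver when it follows only the declared prerequisites."""
    seen = set()
    stack = list(DECLARED[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(DECLARED[dep])
    return seen


def find_missing():
    """Check every module on its own against an empty solver."""
    problems = {}
    for module in DECLARED:
        available = declared_closure(module)
        missing = [d for d in USED[module] if d not in available]
        if missing:
            problems[module] = missing
    return problems


problems = find_missing()
print(problems)
```

Note that the check for D alone finds nothing, because A arrives via B.
Only the separate check for C exposes the gap, so one run per module is
needed.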

Let's consider the "single process" make. Now A, B, C and D are single
targets (e.g. libraries), not modules, but that doesn't make a
difference. The idea is to have one makefile for each target that can be
used to build this target when all preconditions are met (files are
present in solver), exactly as you do with "build $(MODULE)" currently.

(Sidestep: in fact we plan to have a makefile that includes the
makefiles of all targets in one module, to support the current working
style of our developers. But that isn't important for now.)

For a complete build there will be another makefile that includes the
makefiles of all targets. Using this "super makefile" makes a big
difference for the build stability. Regardless of whether you do a full
or a partial build (e.g. an incompatible build from vcl onwards), the
same makefile is always used, containing all dependencies.

A missing dependency declaration now means that the "super makefile"
does not include the makefile that explains how a particular
precondition can be built in case it isn't there (usually meaning: in
the solver). In fact a missing dependency now becomes a missing rule.

Unlike the build.pl approach, a missing rule/dependency means that
it is not present anywhere - it cannot happen that a build for "B"
knows how to build "A", but a build for "C" doesn't know it. Thus the
masking of missing dependencies that I have described above is not
there. Moreover, a missing rule will be detected even before the build
starts, so mistakes are found faster: it is sufficient to call "make"
with an empty solver to detect the missing rules.
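
The following Python sketch illustrates the idea under simplifying
assumptions; it is not how GNU make is implemented. The names FRAGMENTS,
include_all and missing_rules are hypothetical. Each per-target makefile
is modelled as a small rule table, "including" means merging all tables
into one, and the check runs on the merged table before anything is
built. The fragment for A is deliberately left out.

```python
# Hypothetical per-target makefile fragments: target -> prerequisites.
# The fragment for A was never included in the "super makefile".
FRAGMENTS = {
    "B": {"B": ["A"]},
    "C": {"C": ["A"]},
    "D": {"D": ["B", "C"]},
}


def include_all(fragments):
    """Merge all fragments into one rule table ("include, not execute")."""
    rules = {}
    for fragment in fragments.values():
        rules.update(fragment)
    return rules


def missing_rules(rules, solver):
    """Prerequisites that have no rule and are not already in the solver."""
    needed = {dep for deps in rules.values() for dep in deps}
    return sorted(needed - set(rules) - solver)


rules = include_all(FRAGMENTS)
empty_solver = missing_rules(rules, solver=set())   # nothing delivered yet
prebuilt_a = missing_rules(rules, solver={"A"})     # A already in solver
print(empty_solver)
print(prebuilt_a)
```

Because there is only one rule table, A is either known to every target
or to none. With an empty solver the gap is reported without compiling
anything; with A already delivered it stays hidden, which is why the
check has to start from an empty solver.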

So unless you can't bear the overhead of evaluating all dependencies all
the time (which hopefully will take less time than running through all
modules today), you can always just call "make" after your changes
anywhere in the code and the result will be OK. (I've put aside for now
what make will produce at the end: just a solver, packages/installation
sets or a runnable office instance somewhere - that doesn't matter here.)

Of course it is still possible to create wrong makefiles, but a very
important error (that happens quite easily in our current system) will
be removed.

What does that tell us about CMake? If CMake does not allow us to use a
"super makefile" including all other makefiles, we don't get the build
stability improvements that the GNU Make approach can give us. If
we can have a "super makefile" that *executes* the others (recursive
approach), we at least get an improvement because this file will
replace all build.lst files. But the only way to find missing
dependencies/rules is still to execute the build and hope that the
errors are not masked.

So again: how can we implement "include, not execute" with CMake?

Regards,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.

