Jussi Pakkanen wrote:
> On Wed, Jan 13, 2010 at 5:21 PM, bjoern michaelsen - Sun Microsystems
> - Hamburg Germany <bjoern.michael...@sun.com> wrote:
>
>> Mu. As long as you have a recursive build process, you have a lot of
>> implicit dependencies and those are hurting parallelization. On top of
>> that, process instantiation and file-I/O is very, very expensive on
>> Windows, the slowest and most problematic platform of all (both are used
>> extensively when doing recursion). This might not matter too much for a
>> full build that package maintainers usually do. But it matters very
>> much for the change->rebuild->change-cycle that devs usually do (Both
>> implicit deps and recursion over mostly noop build tasks really hurt
>> here).
>
> There has been much speculation on the performances but very little
> hard facts.

It's not only about performance, I still wonder if and how CMake (or better: the CMake+GNU Make+MSBuild combo) is able to support the "all in one process" approach. So far I can only see that with CMake we either can copy the current iterative approach (build.pl) or use a recursive one.
So, how can we implement "include, not execute" with CMake?

But IMHO we are talking too much about performance. Let me recall the even more important requirement that made me think about changes in the first place: consistency and reliability.

Building a single target currently requires maintaining several files: at least a build.lst, a d.lst and several makefile.mk files (others like makefile.pmk etc. put aside for the moment). It's easy to forget something, but even if you don't, it's a PITA how many files you have to edit if you just move some files around. I'm doing some refactoring ATM and it really sucks. So it is no surprise that the current system is very susceptible to errors.

Let me explain this with a simple example. Consider the modules A, B, C and D. D depends on B and C, which both depend on A. This is reflected by corresponding entries in the build.lst files of these modules. Now assume I forget to add the dependency on A to the build.lst of C. When I do a complete build from scratch for module D, it will work if B is built before C: building B also builds A, so C's precondition is met and C builds fine as well. If C is built before B, the build will break because A is missing.

Now consider the ~200 modules we have and the multitude of dependencies between them. If a particular dependency is missing, whether your build breaks or runs, and whether the result is correct, depends on the number of modules that are built, the number of processes used (-Pn) and maybe other circumstances. So a build might work well for those who made some changes, but break for others later on, after the changes have been integrated.

How can one detect such a missing dependency? The only way I see is to call "build --all" with an empty solver for each and every module to get a consistent list in the first place, and in future to repeat this for every module whose dependencies have changed.
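
The A/B/C/D example can be played through in a few lines of Python. This is only a sketch of the masking effect, not of how build.pl actually works: the tables DECLARED and ACTUAL and the functions build and try_build are hypothetical names invented for this illustration, and the "order" argument merely stands in for whatever decides which prerequisite happens to be built first.

```python
# Hypothetical stand-ins for the build.lst files: what each module declares.
# C really needs A, but its entry forgets to say so.
DECLARED = {"A": [], "B": ["A"], "C": [], "D": ["B", "C"]}
# What each module actually needs to be present in the solver.
ACTUAL = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


def build(module, solver, order):
    """Build the declared prerequisites of `module` first, then the module.

    `solver` is the set of modules delivered so far; `order` is a list of
    all modules that decides which prerequisite is tried first.
    """
    if module in solver:
        return
    for dep in sorted(DECLARED[module], key=order.index):
        build(dep, solver, order)
    missing = [dep for dep in ACTUAL[module] if dep not in solver]
    if missing:
        raise RuntimeError(f"{module} needs {missing}, which is not in the solver")
    solver.add(module)


def try_build(target, order):
    """Build `target` from an empty solver and report the outcome."""
    solver = set()
    try:
        build(target, solver, order)
        return "ok"
    except RuntimeError as err:
        return f"broken: {err}"


# B before C: building B delivers A, so the gap in C's list is masked.
print(try_build("D", ["A", "B", "C", "D"]))
# C before B: A has not been delivered yet, so the same tree breaks.
print(try_build("D", ["A", "C", "B", "D"]))
# Building C alone from an empty solver exposes the gap in any order.
print(try_build("C", ["A", "B", "C", "D"]))
```

The last call corresponds to the per-module check from an empty solver: building C on its own shows the gap regardless of ordering, but it has to be run for every module separately.
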
This check is not sufficient to support all flavors of build.pl, but it is at least sufficient for "build" and "build --all". Is it realistic to expect that everybody will do that? Wouldn't it be better if the error detection were "built in"?

Let's consider the "single process" make. Now A, B, C and D are single targets (e.g. libraries), not modules, but that doesn't make a difference. The idea is to have one makefile for each target that can be used to build this target when all its preconditions are met (files are present in the solver), exactly like you do a "build $(MODULE)" currently. (Sidestep: in fact we plan to have a makefile that includes the makefiles of all targets in one module, to support the current working style of our developers. But that isn't important for now.) For a complete build there will be another makefile that includes the makefiles of all targets.

Using this "super makefile" makes a big difference for build stability. Regardless of whether you do a full or a partial build (e.g. an incompatible build from vcl onwards), the same makefile is always used, and it contains all dependencies. A missing dependency declaration now means that the "super makefile" does not include the makefile that explains how a particular precondition can be built in case it isn't there (usually meaning: in the solver). In effect, a missing dependency becomes a missing rule. Unlike with the build.pl approach, a missing rule/dependency is then not present anywhere: it cannot happen that a build for "B" knows how to build "A" but a build for "C" doesn't. Thus the masking of missing dependencies that I described above does not occur. Moreover, a missing rule is detected even before the build starts, so mistakes are found faster: it is sufficient to call "make" with an empty solver to detect the missing rules.
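
As a rough illustration of "a missing dependency becomes a missing rule", here is a small Python sketch. It does not model GNU Make itself; it only shows the idea that once all per-target makefiles are merged into one rule table, a prerequisite without a rule can be found by inspecting that table, before anything is built. The file names (A.mk etc.) and the functions load_rules and missing_rules are hypothetical and exist only for this example.

```python
# Hypothetical per-target makefiles: each one contributes the rule for its
# own target together with the prerequisites that target needs.
TARGET_MAKEFILES = {
    "A.mk": {"A": []},
    "B.mk": {"B": ["A"]},
    "C.mk": {"C": ["A"]},
    "D.mk": {"D": ["B", "C"]},
}


def load_rules(included):
    """Merge the rules of all included makefiles into one global table."""
    rules = {}
    for name in included:
        rules.update(TARGET_MAKEFILES[name])
    return rules


def missing_rules(rules, solver=frozenset()):
    """Return prerequisites that have no rule and are not in the solver.

    Nothing is built here; only the merged rule table is inspected.
    """
    return sorted(
        {
            dep
            for deps in rules.values()
            for dep in deps
            if dep not in rules and dep not in solver
        }
    )


# The "super makefile" forgets to include A.mk: with an empty solver this
# is reported up front, whatever order the targets would be built in.
print(missing_rules(load_rules(["B.mk", "C.mk", "D.mk"])))
# With A already delivered to the solver, nothing is reported.
print(missing_rules(load_rules(["B.mk", "C.mk", "D.mk"]), solver={"A"}))
# With every makefile included, nothing is missing either.
print(missing_rules(load_rules(["A.mk", "B.mk", "C.mk", "D.mk"])))
```

Because there is exactly one rule table, B and C cannot disagree about whether A can be built, which is the difference to the per-module lists.
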
So unless you can't bear the overhead of evaluating all dependencies all the time (which hopefully will take less time than running through all modules today), you can always just call "make" after your changes anywhere in the code and the result will be OK. (I've put aside for now what make will produce at the end: just a solver, packages/installation sets or a runnable office instance somewhere. That doesn't matter here.)

Of course it is still possible to create wrong makefiles, but a very important error (one that happens quite easily in our current system) will be removed.

What does that tell us about CMake? If CMake does not allow us to use a "super makefile" that includes all other makefiles, we don't get the build stability improvements that the GNU Make approach can give us. If we can have a "super makefile" that *executes* the others (recursive approach), we at least get an improvement, because this file replaces all build.lst files. But the only way to find missing dependencies/rules is still to execute the build and hope that the errors are not masked.

So again: how can we implement "include, not execute" with CMake?

Regards,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.