Walter Bright wrote:
What you can try is creating a database that is basically a lib (call it A.lib) of all the modules compiled with -lib. Then recompile all modules that depend on changed modules in one command, also with -lib, call it B.lib. Then for all the obj's in B, replace the corresponding ones in A.

OK, there we go: http://h3.team0xf.com/increBuild2.7z // I hope it's fine to include LIBUNRES here. It's just for convenience.

This is the second incarnation of that incremental build tool experiment. This time it uses -lib instead of -multiobj, as suggested by Walter.

The algorithm works as follows:

* compile modules to a .lib file
* extract objects with static ctors or the __Dmain function (remove them from the lib)
* find out which old object files should be replaced
        * any object for which at least one symbol was re-generated in this compilation pass
* pack up the obsoleted object files into a 'junk' library
* prepend the 'junk' library to the /library chain/
* prepend the newly compiled library to the /library chain/
* link the executable by passing the cached object files and the whole library chain to the linker

It doesn't use the simple approach of having just one 'junk'/'A.lib' library and appending objects to it, because that's pretty slow due to the librarian having to re-generate the dictionary at each such operation. So instead it keeps a chain of all libraries generated in this process and passes them to the linker in the right order. This will waste more space than the naive approach, but should be faster.
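To make the bookkeeping above a bit more concrete, here is a minimal sketch in Python of one incremental pass and the resulting library chain. It is only a model under stated assumptions, not the code from the archive: objects are reduced to a name plus a set of symbols, the has_static_ctor flag stands in for whatever detection the real tool does, and the names Obj, BuildState, incremental_pass and linker_inputs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Obj:
    """Toy stand-in for an object file: a name plus the symbols it defines."""
    name: str
    symbols: frozenset
    has_static_ctor: bool = False  # assumption: detected elsewhere


@dataclass
class BuildState:
    cached: list = field(default_factory=list)  # objects passed to the linker directly
    chain: list = field(default_factory=list)   # libraries (lists of Obj), newest first


def must_extract(obj):
    # Objects with static ctors or the __Dmain function are pulled out of the lib.
    return obj.has_static_ctor or "__Dmain" in obj.symbols


def incremental_pass(state, compiled):
    """Apply one compilation pass; return the obsoleted objects (the 'junk')."""
    extracted = [o for o in compiled if must_extract(o)]
    new_lib = [o for o in compiled if not must_extract(o)]

    # Every symbol that was re-generated in this pass.
    regenerated = set().union(*(o.symbols for o in compiled))

    # Old cached objects sharing any re-generated symbol are obsolete.
    obsolete = [o for o in state.cached if o.symbols & regenerated]
    kept = [o for o in state.cached if not (o.symbols & regenerated)]
    state.cached = kept + extracted

    if obsolete:
        state.chain.insert(0, obsolete)  # prepend the 'junk' library
    state.chain.insert(0, new_lib)       # then prepend the newly compiled library
    return obsolete


def linker_inputs(state):
    """Names of cached objects, followed by the library chain in link order."""
    objs = [o.name for o in state.cached]
    libs = [[o.name for o in lib] for lib in state.chain]
    return objs, libs
```

In this model the chain only ever grows at the front, which mirrors the trade-off described above: more disk space in exchange for never rewriting an existing library.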

The archive contains the source code and a compiled binary (DMD-Win only for now... Sorry, folks) as well as a little test in the test/ directory. It shows how naive incremental compilation fails (break.bat) and how this tool works (work.bat).

The tool can be used with the latest Mercurial revision of xfBuild ( http://bitbucket.org/h3r3tic/xfbuild/ ) by passing "+cincreBuild" to it. The support is a massive hack though, so expect some strangeness.

I was able to run it on the 'Test1' demo of my Hybrid GUI ( http://team0xf.com:1024/hybrid/file/c841d95675ca/Test1.d ) and a simple/dumb ray tracer based on OMG ( http://team0xf.com:1024/omg/file/5199ed783490/Tracer.d ). In incremental compilation it's not noticeably slower than the naive approach. However, DMD consumes more memory in the -lib mode, and the executables produced by this approach are larger for some reason. For instance, with Hybrid, Test1.exe is about 20MB with increBuild, compared to about 5MB with the traditional approach. Perhaps there's some simple way to remove this bloat: when compressed with UPX, even with the fastest compression method, the executables differ by just a few kilobytes.

When building my second largest project, DMD eats up about 1.2GB of memory and dies (even without -g). Luckily, xfBuild allows me to set a limit on the number of modules compiled at a time, so when I capped it at 200, it compiled... but didn't link :( Somewhere in the process a library is created that confuses OPTLINK as well as "lib -l". There's one symbol in it that neither of these is able to see, and it results in an undefined reference when linking. The symbol is clearly there when using a lib dumping tool from DDL or "libunres -d -c". I've dropped the lib at http://h3.team0xf.com/strangeLib.7z . The symbol in question is compressed and this newsgroup probably won't chew the non-ANSI chars well, but it can be found via a regex "D2xf3omg4core.*ctFromRealVee0P0Z".

One thing slowing this tool down is the need to call the librarian multiple times. DMD -lib will sometimes generate multiple objects with the same name and you can only extract them (when using the librarian) by running lib -x multiple times. DMD should probably be patched up to include fully qualified module names in objects instead of just the last name (foo.Mod and bar.Mod both yield Mod.obj in the library), as -op doesn't seem to help here.
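As a rough illustration of the name clash, the following Python sketch counts how many extraction passes a set of modules would need under each naming scheme. It assumes one pass can pull out at most one object per distinct name; the function names and the "fully qualified" naming format are hypothetical, chosen only for the example.

```python
from collections import Counter


def last_name_object(module):
    # Naming as described above: foo.Mod and bar.Mod both become Mod.obj.
    return module.rsplit(".", 1)[-1] + ".obj"


def qualified_object(module):
    # Hypothetical scheme: keep the fully qualified module name.
    return module + ".obj"


def extraction_passes(modules, namer):
    """Passes needed if each pass extracts one object per distinct name."""
    counts = Counter(namer(m) for m in modules)
    return max(counts.values(), default=0)
```

For example, extraction_passes(["foo.Mod", "bar.Mod", "baz.Other"], last_name_object) gives 2, while the same call with qualified_object gives 1.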

Another idea that would map well onto any incremental builder is to write a tool that finds the differences between two versions of a module and tells whether e.g. they're limited to function bodies. Then an incremental builder could assume that it doesn't have to recompile any dependencies, just the one modified file. Unfortunately, this assumption doesn't always hold - functions could be used via CTFE to generate code, so the changes escape. Personally I'm of the opinion that functions should be explicitly marked for CTFE, and this is just another reason for doing so. I'm using a patched DMD with an added pragma(ctfe), which instructs the compiler not to run any codegen or generate debug info for functions/aggregates marked as such. This trick alone can slim an executable down by a good megabyte, which sometimes is a life-saver with OPTLINK. I've been hearing that other people put their CTFE stuff into .di files, but this approach doesn't cover all cases of codegen via CTFE and string mixins.
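The decision such a builder would make could look roughly like the Python sketch below. It assumes the two inputs the paragraph calls for are already available: whether the change is limited to function bodies, and whether any changed function may be run through CTFE. Both flags, the dependents map, and the choice to follow dependents transitively are assumptions made for illustration; no such tool exists in the archive.

```python
from collections import deque


def modules_to_recompile(changed, dependents, body_only, used_in_ctfe):
    """Return the set of modules to rebuild after `changed` was edited.

    dependents maps a module name to the modules that import it directly.
    """
    # A body-only change that cannot leak through CTFE stays local.
    if body_only and not used_in_ctfe:
        return {changed}

    # Otherwise fall back to rebuilding everything that depends on the
    # module (assumption: dependents are followed transitively).
    result = {changed}
    queue = deque([changed])
    while queue:
        module = queue.popleft()
        for dep in dependents.get(module, ()):
            if dep not in result:
                result.add(dep)
                queue.append(dep)
    return result
```

With explicit CTFE marking, used_in_ctfe could be answered by looking at the changed functions alone; without it, a conservative builder would have to treat it as true.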

I'm afraid I won't be doing any other prototypes shortly - I really need to focus on my master's thesis :P But then, I don't really know how this tool can be improved without hacking the compiler or writing custom OMF processing.


--
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode
