This is about fbuild, not Felix .. it contains a lot of my opinion on build
systems.

The first thing to note is: make sucks. Totally. It is completely unprincipled.

What make does is: there is a dependency tree, and the attributes on the nodes
of the tree, primarily the leaves, are build rules. The idea is simple enough:

if A depends on B, A and B must be files, and their date stamps are compared
to see if A is newer than B. If not, the build rule is invoked.
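
To see how little machinery is involved, here is that per-edge test sketched
in Python (the helper name is mine, not make's):

import os

def needs_rebuild(target, source):
    # make's per-edge test: rebuild if the target is missing,
    # or is not newer than the source it depends on
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) <= os.path.getmtime(source)

# if needs_rebuild("a.o", "a.c"): run the build rule, e.g. cc -c a.c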

Although the dependency tree is declarative, the build rules are not: they're
executable, and the only coupling between the target and the build rule is
"gratuitous", that is, an accident of the programmer using the right command.

The structure cannot cope with non-file targets, and it can't cope with build
commands that produce multiple outputs. It also can't cope with dynamic changes,
nor with recursion. Furthermore, the trees can't easily be decomposed (there is
plenty written about how bad recursive make is, "Recursive Make Considered
Harmful" being the classic).

It would seem the idea of a "make target" is a good one, but it is not. There
is rarely any need to make anything much other than the whole system, or at
least a significant part of it, e.g. make config, make stuff, make tests,
make docs: these things make sense, but they're not actually targets.

There's a hidden, KILLER bug in the make concept as well: the idea that you
can actually express all the dependencies. In the 1970s this may have been the
case, since building C programs was really the only thing of interest ..
for modern systems, commands are generative: for example, a doxygen pass
over a suite of code generates a lot of files.

The *correct* way to construct a build system is to give up the idea that
it is a functional, declarative thing. It's not. Build systems are executable,
imperative code. Imperative implies sequenced, so there's no need to
mess around stating dependencies; all you need to do is get the build
order correct. Once. And fix it up if things change. [Within this concept
you can still use dependency checking for subparts, but that checking
will be *specialised*: for example, using ocamldep to build OCaml code
in the right order .. this is unrelated to the overall build system
architecture.]

With an imperative program, it is easy to have a few switches and do:

do_config = do_test = do_build = False

# raw options
if "config" in options: do_config = True
if "test" in options: do_test = True
...
# calculate dependencies between phases
if do_test: do_config = True
..
# do the work:

if do_config: do_the_config()
if do_build: do_the_build()
...

These switches represent gross concepts like building documentation
or running tests: they're not micro-targets, they're commands.

So given all this, it is easy to see how to write build scripts: they're
programs. It's that simple! And the programming language is a full-scale
one like Python or whatever, not crud like "make" or "autoconf" or other
rubbish mini-languages lacking proper structure and capability.

But there's a problem: rebuilding everything when you make a change is SLOW.

Sure, it should usually get the right results without stating dependencies ..
but few programmers can drink that much coffee and live ...

Erick has come up with a brilliant solution to this problem.

Actually, it's a very well known old fashioned solution everyone uses
everywhere .. 

Caching. By writing functions with inputs and outputs made explicit
(including files), it is possible to cache the result of applying a function.
So instead of invoking the C compiler, fbuild can just use the cached
value if it exists and is up to date. The "up-to-date-ness" is calculated
from a small set of inputs and outputs for that function.
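
As a rough illustration of the idea (not fbuild's actual API; all the names
here are invented), a build step can be wrapped so its result is recomputed
only when the argument values and file contents it depends on change:

import hashlib, json, os, pickle, subprocess

CACHE_DIR = ".build_cache"

def file_digest(path):
    # hash file contents, so a touched-but-unchanged file doesn't force a rebuild
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def cached_step(func):
    # cache func's return value, keyed on its arguments and on the
    # digests of any arguments that name existing files
    def wrapper(*args):
        key_parts = []
        for a in args:
            if isinstance(a, str) and os.path.isfile(a):
                key_parts.append(["file", a, file_digest(a)])
            else:
                key_parts.append(["val", repr(a)])
        key = hashlib.sha256(json.dumps(key_parts).encode()).hexdigest()
        cache_file = os.path.join(CACHE_DIR, func.__name__ + "-" + key)
        if os.path.exists(cache_file):
            with open(cache_file, "rb") as f:
                return pickle.load(f)   # cache hit: skip the work entirely
        result = func(*args)            # cache miss: do the work
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(cache_file, "wb") as f:
            pickle.dump(result, f)
        return result
    return wrapper

@cached_step
def compile_c(src):
    obj = src.replace(".c", ".o")
    subprocess.check_call(["cc", "-c", src, "-o", obj])
    return obj   # the outputs are part of the function's result

Note the sketch only keys on the inputs; a real system would also check that
the recorded outputs still exist. But the shape is the point: the cache is
per-function, and entirely optional.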

Failing to cache something is not a sin; it has no impact on semantics,
it just isn't optimal. The build will be slower than required.

And if you change the build script itself? Fbuild can cache that too.

There is a subtle core difference from make here. Make takes a dependency
tree and does a recursive descent until it finds a leaf (a source file) and
a target, checks dates, and runs the build commands if required to update
the target: using date stamps is a crude way to do caching.

But the system is driven top down. This is the BIG MISTAKE.

Build systems should work the other way. They should work UP from modified
source files, rebuilding everything that will change as a result.

You may say: but it's the same! Recursive descent IS working up!

Yes, it is: provided the dependencies are complete and correct,
it is working up, because that is the only way. The point is you don't need
the dependencies. Instead, you just note all the source files that changed,
and then re-execute the steps that use them. That changes more files,
so the process continues until nothing changes .. note in passing that this
handles recursion by reaching a fixpoint.
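
A sketch of that driving loop (the step representation here is invented for
illustration, not fbuild's interface):

def run_to_fixpoint(steps, changed):
    # steps: list of (inputs, run) pairs; run() re-executes the step and
    # returns the set of output files whose contents actually changed
    # (judged, say, by comparing digests).
    # changed: the set of source files known to be modified.
    while changed:
        newly_changed = set()
        for inputs, run in steps:
            if changed & set(inputs):
                newly_changed |= set(run())
        changed = newly_changed

Because run() reports only genuine changes, even a cyclic set of steps settles
down once its outputs stop changing: that is the fixpoint. And if run() is one
of the cached functions above, a pass over unchanged inputs costs almost nothing.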

The trick here is to *capture* the dependencies as you're working up:
they're the outputs of the build functions. The point is that if the inputs
do NOT change, you can calculate these outputs fast using a cache.

This means you can just run the whole build process every time,
and it will be fast. 

The main caveat here is similar to make: if you cache something you have
to capture all the significant inputs and outputs.

The difference is: this is a local decision, not a global requirement.
You don't have to cache anything.

It's a very clever solution. Nice work Erick!


--
john skaller
skal...@users.sourceforge.net




