On Monday, 19 August 2013 at 20:23:46 UTC, H. S. Teoh wrote:
With D's honestly awesome metaprogramming features, templates are liable
to be (and in fact are) used a LOT. This leads to the unfortunate
situation of template bloat: every time you instantiate a template, it
adds yet another copy of the templated code into your object file. This
gets worse when you use templated structs/classes, each of which may
define some number of methods, and each instantiation adds yet another
copy of all those methods.

This is doubly bad if these templates are used only during compile-time,
and never referenced during runtime. That's a lot of useless baggage in
the final executable. Plus, it leads to issues like this one:

http://d.puremagic.com/issues/show_bug.cgi?id=10833

While looking at this bug, I got an idea: what if, instead of emitting
template instantiations into the same object file as non-templated code,
the compiler were to emit each instantiation into a separate static
*library*? For instance, if you have code in program.d, then the
compiler would emit non-templated code like main() into program.o, but
all template instantiations get put in, say, libprogram.a. Then during
link time, the compiler runs `ld -oprogram program.o libprogram.a`, and
then the linker will pull in symbols from libprogram.a that are
referenced by program.o.

If we were to set things up so that libprogram.a contains a separate
unit for each instantiated template function, then the linker would
pull in only the code that is actually referenced at runtime. For
example, say our code looks like this:

    struct S(T) {
        T x;
        T method1(T t) { ... }
        T method2(T t) { ... }
        T method3(T t) { ... }
    }

    void main() {
        auto sbyte  = S!byte();
        auto sint   = S!int();
        auto sfloat = S!float();

        sbyte.method1(1);
        sint.method2(2);
        sfloat.method3(3.0);
    }

Then the compiler would put main() in program.o, and *nothing else*. In
program.o, there would be undefined references to S!byte.method1,
S!int.method2, and S!float.method3, but not the actual code. Instead,
when the compiler sees S!byte, S!int, and S!float, it puts all of the
instantiated methods inside libprogram.a as separate units:

    libprogram.a:
        struct_S_byte_method1.o:
            S!byte.method1
        struct_S_byte_method2.o:
            S!byte.method2
        struct_S_byte_method3.o:
            S!byte.method3
        struct_S_int_method1.o:
            S!int.method1
        struct_S_int_method2.o:
            S!int.method2
        struct_S_int_method3.o:
            S!int.method3
        struct_S_float_method1.o:
            S!float.method1
        struct_S_float_method2.o:
            S!float.method2
        struct_S_float_method3.o:
            S!float.method3

Since the compiler doesn't know at instantiation time which of these
methods will actually be used, it simply emits all of them and puts them
into the static library.

Then at link-time, the compiler tells the linker to include libprogram.a
when linking program.o. So the linker goes through each undefined
reference, and resolves it by linking in the unit in libprogram.a that
defines said reference. So it would link in the code for S!byte.method1,
S!int.method2, and S!float.method3. The other 6 instantiations are not
linked into the final executable, because they are never actually
referenced by the runtime code.
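
The resolution step described above can be sketched as a toy model in D.
This is only an illustration under simplifying assumptions: the
`resolve` function and the `definedBy` table are hypothetical names, each
archive unit defines exactly one symbol, and units do not reference each
other. A real linker works on object-file symbol tables and is
considerably more involved.

```d
import std.stdio;

// Toy model only (hypothetical helper): given a table mapping each symbol
// to the archive unit that defines it, return the units needed to satisfy
// the undefined references. Unreferenced units are simply never picked.
string[] resolve(string[string] definedBy, string[] undefined)
{
    string[] linked;
    foreach (sym; undefined)
        if (auto member = sym in definedBy)
            linked ~= *member;
    return linked;
}

void main()
{
    // Model of libprogram.a from the example: 3 types x 3 methods = 9 units.
    string[string] definedBy;
    foreach (ty; ["byte", "int", "float"])
        foreach (method; ["method1", "method2", "method3"])
            definedBy["S!" ~ ty ~ "." ~ method] =
                "struct_S_" ~ ty ~ "_" ~ method ~ ".o";

    // Undefined references left in program.o by main().
    auto undefined = ["S!byte.method1", "S!int.method2", "S!float.method3"];
    auto linked = resolve(definedBy, undefined);

    foreach (member; linked)
        writeln("link: ", member);
    writeln(definedBy.length - linked.length, " units left in the archive");
}
```

In this model three units are selected and the remaining six stay behind
in the archive, which matches the count given above.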

So this way, we minimize template bloat to only the code that's actually
used at runtime. If a particular template function instantiation is only
used during CTFE, for example, it would be present in libprogram.a but
won't get linked, because none of the runtime code references it. This
would fix bug 10833.
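
As a small illustration of the CTFE-only case, here is a sketch in D.
The `factorial` template and the `fact10` constant are hypothetical names
invented for this example, not taken from the bug report. The template is
instantiated, but only to compute a compile-time constant; nothing in
main() calls it at runtime.

```d
import std.stdio;

// Hypothetical helper template, used here only at compile time.
ulong factorial(T)(T n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// An enum initializer must be evaluated at compile time, so this call
// to factorial!int runs under CTFE.
enum ulong fact10 = factorial(10);

void main()
{
    // Only the constant is used; no runtime code references factorial!int.
    writeln(fact10); // prints 3628800
}
```

Under the scheme proposed above, the code for factorial!int would sit in
libprogram.a, and since program.o holds no undefined reference to it, the
linker would have no reason to pull it in.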
Is this workable? Is it implementable in DMD?
T
Without link-time optimisation, this prevents inlining, doesn't it?