Re: Possible solution to template bloat problem?

John Colvin Mon, 19 Aug 2013 15:16:10 -0700

On Monday, 19 August 2013 at 20:23:46 UTC, H. S. Teoh wrote:

With D's honestly awesome metaprogramming features, templates are liable
to be (and in fact are) used a LOT. This leads to the unfortunate
situation of template bloat: every time you instantiate a template, it
adds yet another copy of the templated code into your object file. This
gets worse when you use templated structs/classes, each of which may
define some number of methods, and each instantiation adds yet another
copy of all those methods.
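
A minimal sketch of why each instantiation costs space, assuming a placeholder method body (not taken from the original post): every instantiation of a templated struct method is a distinct symbol with its own mangled name, so the compiler has to generate separate code for each one.

```d
import std.stdio;

struct S(T)
{
    T x;
    // Placeholder body, assumed for illustration only.
    T method1(T t) { return cast(T)(x + t); }
}

void main()
{
    // Each instantiation has its own mangled name, i.e. it is a
    // separate symbol that needs its own copy of the code.
    writeln(S!byte.method1.mangleof);
    writeln(S!int.method1.mangleof);
    writeln(S!float.method1.mangleof);

    static assert(S!byte.method1.mangleof != S!int.method1.mangleof);
    static assert(S!int.method1.mangleof != S!float.method1.mangleof);
}
```

The exact mangled strings depend on the module name and compiler, so none are shown here.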

This is doubly bad if these templates are used only during compile-time,
and never referenced during runtime. That's a lot of useless baggage in
the final executable. Plus, it leads to issues like this one:

        http://d.puremagic.com/issues/show_bug.cgi?id=10833
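
As a rough illustration of a template that is used only at compile time, consider the sketch below. `countDigits` is a hypothetical helper, not something from the bug report. It is called only through an `enum` initializer, which forces CTFE, so nothing at runtime refers to `countDigits!int`. Whether its body still ends up in the object file depends on the compiler and version.

```d
import std.stdio;

// Hypothetical helper, called below only in a compile-time context.
size_t countDigits(T)(T value)
{
    size_t n = 1;
    while (value >= 10)
    {
        value /= 10;
        ++n;
    }
    return n;
}

// An enum initializer must be evaluated at compile time (CTFE),
// so the result is baked in as a constant.
enum digits = countDigits!int(12345);

void main()
{
    static assert(digits == 5);
    // Only the constant is used here; main() never calls countDigits!int.
    writeln(digits);
}
```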

While looking at this bug, I got an idea: what if, instead of emitting
template instantiations into the same object file as non-templated code,
the compiler were to emit each instantiation into a separate static
*library*? For instance, if you have code in program.d, then the
compiler would emit non-templated code like main() into program.o, but
all template instantiations get put in, say, libprogram.a. Then during
link time, the compiler runs `ld -oprogram program.o libprogram.a`, and
then the linker will pull in symbols from libprogram.a that are
referenced by program.o.

If we were to set things up so that libprogram.a contains a separate
unit for each instantiated template function, then the linker would
actually pull in only code that is actually referenced at runtime. For
example, say our code looks like this:

        struct S(T) {
                T x;
                T method1(T t) { ... }
                T method2(T t) { ... }
                T method3(T t) { ... }
        }
        void main() {
                auto sbyte  = S!byte();
                auto sint   = S!int();
                auto sfloat = S!float();

                sbyte.method1(1);
                sint.method2(2);
                sfloat.method3(3.0);
        }

Then the compiler would put main() in program.o, and *nothing else*. In
program.o, there would be undefined references to S!byte.method1,
S!int.method2, and S!float.method3, but not the actual code. Instead,
when the compiler sees S!byte, S!int, and S!float, it puts all of the
instantiated methods inside libprogram.a as separate units:

        libprogram.a:
                struct_S_byte_method1.o:
                        S!byte.method1
                struct_S_byte_method2.o:
                        S!byte.method2
                struct_S_byte_method3.o:
                        S!byte.method3
                struct_S_int_method1.o:
                        S!int.method1
                struct_S_int_method2.o:
                        S!int.method2
                struct_S_int_method3.o:
                        S!int.method3
                struct_S_float_method1.o:
                        S!float.method1
                struct_S_float_method2.o:
                        S!float.method2
                struct_S_float_method3.o:
                        S!float.method3

Since the compiler doesn't know at instantiation time which of these
methods will actually be used, it simply emits all of them and puts them
into the static library.

Then at link-time, the compiler tells the linker to include libprogram.a
when linking program.o. So the linker goes through each undefined
reference, and resolves them by linking in the module in libprogram.a
that defines said reference. So it would link in the code for
S!byte.method1, S!int.method2, and S!float.method3. The other 6
instantiations are not linked into the final executable, because they
are never actually referenced by the runtime code.
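
The resolution step can be modelled with a toy sketch. This is not how a real linker is implemented: `membersToLink` is a hypothetical function, each archive member is assumed to define exactly one symbol (as in the libprogram.a listing above), and references made by the pulled-in members themselves are ignored.

```d
import std.stdio;

/// Toy model of archive resolution: for each undefined reference,
/// pick the archive member that defines it. Members nobody
/// references are never selected.
string[] membersToLink(string[] undefinedRefs, string[string] definedBy)
{
    string[] linked;
    foreach (sym; undefinedRefs)
    {
        // `in` yields a pointer to the value, or null if the key is absent.
        if (auto member = sym in definedBy)
            linked ~= *member;
    }
    return linked;
}

void main()
{
    // Symbol -> archive member that defines it (9 members in total).
    string[string] definedBy;
    foreach (type; ["byte", "int", "float"])
        foreach (n; ["method1", "method2", "method3"])
            definedBy["S!" ~ type ~ "." ~ n] = "struct_S_" ~ type ~ "_" ~ n ~ ".o";

    // Undefined references left in program.o by main().
    auto refs = ["S!byte.method1", "S!int.method2", "S!float.method3"];

    auto linked = membersToLink(refs, definedBy);
    writeln(linked);

    assert(definedBy.length == 9);
    assert(linked.length == 3); // the other 6 members stay in the archive
}
```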

So this way, we minimize template bloat to only the code that's actually
used at runtime. If a particular template function instantiation is only
used during CTFE, for example, it would be present in libprogram.a but
won't get linked, because none of the runtime code references it. This
would fix bug 10833.

Is this workable? Is it implementable in DMD?


T


Without link-time optimisation, this prevents inlining, doesn't it?
