Re: Program size, linking matter, and static this()

Steven Schveighoffer Fri, 16 Dec 2011 11:25:23 -0800

On Fri, 16 Dec 2011 13:29:18 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

Hello,
Late last night Walter and I figured out a few interesting tidbits of information. Allow me to give some context, discuss them, and sketch a few approaches for improving things.
A while ago Walter wanted to enable function-level linking, i.e. only get the needed functions from a given (and presumably large) module. So he arranged things so that a library contains many small object "files" (that actually are generated from a single .d file and never exist on disk, only inside the library file, which can be considered an archive like tar). Then the linker would only pick the used object "files" from the library and link those in. Unfortunately that didn't have nearly the expected impact - essentially the size of most binaries stayed the same. The mystery was unsolved, and Walter needed to move on to other things.
One particularly annoying issue is that even programs that don't ostensibly use anything from an imported module may balloon inexplicably in size. Consider:
import std.path;
void main(){}
This program, after stripping and all, is some 750KB in size. Removing the import line reduces the size to 218KB. That includes the runtime support, garbage collector, and such, and I'll consider it a baseline. (A similar but separate discussion could be focused on reducing the baseline size, but herein I'll consider it constant.)
What we'd simply want is to be able to import stuff without blatantly paying for what we don't use. If a program imports std.path and uses no function from it, it should be as large as a program without the import. Furthermore, the increase should be incremental - using 2-3 functions from std.path should only increase the executable size by a little, not suddenly link in all code in that module.
But in experiments it seemed like program size would increase in sudden amounts when certain modules were included. After much investigation we figured that the following fateful causal sequence happened:
1. Some modules define static constructors with "static this()" or "shared static this()", and/or static destructors.
2. These constructors/destructors are linked in automatically whenever a module is included.
3. Importing a module with a static constructor (or destructor) will generate its ModuleInfo structure, which contains static information about all module members. In particular, it keeps virtual table pointers for all classes defined inside the module.
4. That means generating ModuleInfo refers to all virtual functions defined in that module, whether they're used or not.
5. The phenomenon is transitive, e.g. even if std.path has no static constructors but imports std.datetime which does, a ModuleInfo is generated for std.path too, in addition to the one for std.datetime. So now classes inside std.path (if any) will be all linked in.
6. It follows that a module that defines classes which in turn use other functions in other modules, and has static constructors (or includes other modules that do) will balloon the size of the executable suddenly.
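Steps 1 through 4 can be observed from inside a program. The following is only a sketch under assumptions: the module name bloatdemo, the class Unused, and the helper classesRecordedFor are all hypothetical names made up for illustration. It walks the ModuleInfo records that druntime exposes and lists the classes recorded for a module, including one that the program never instantiates.

```d
module bloatdemo; // hypothetical module name

import std.stdio : writeln;

// Never instantiated or otherwise referenced by the program.
class Unused
{
    void virtualMethod() {}
}

// Step 1: the module has a static constructor.
shared static this() {}

// Hypothetical helper: names of all classes recorded in the
// ModuleInfo of the module called moduleName.
string[] classesRecordedFor(string moduleName)
{
    string[] names;
    foreach (m; ModuleInfo) // iterates over ModuleInfo* for every module
    {
        if (m is null || m.name != moduleName)
            continue;
        foreach (c; m.localClasses) // TypeInfo_Class for each local class
            if (c !is null)
                names ~= c.name;
    }
    return names;
}

void main()
{
    // Unused is listed even though nothing in the program uses it.
    writeln(classesRecordedFor("bloatdemo"));
}
```

Since each class record carries the class's virtual table, anything reachable from those virtual functions has to be kept by the linker, which is the effect described in step 4.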
There are a few approaches that we can use to improve the state of affairs.
A. On the library side, use static constructors and destructors sparingly inside druntime and std. We can use lazy initialization instead of compulsively initializing library internals. I think this is often a worthy thing to do in any case (dynamic libraries etc) because it only does work if and when work needs to be done at the small cost of a check upon each use.
B. On the compiler side, we could use a similar lazy initialization trick to only refer to class methods in the module if they're actually needed. I'm being vague here because I'm not sure what and how that can be done.
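As an illustration of approach A, here is a minimal sketch of replacing a static constructor with lazy initialization. It is not taken from druntime or Phobos; the module name lazydemo and the names table, tableReady, buildTable, and lookup are hypothetical. Module-level variables in D are thread-local by default, so each thread initializes its own copy on first use.

```d
module lazydemo; // hypothetical module name

private int[] table;     // thread-local by default in D
private bool tableReady; // false until the first lookup in this thread

// The eager alternative that approach A suggests avoiding would be:
//     static this() { table = buildTable(); }

// Hypothetical expensive setup: a table of squares.
private int[] buildTable()
{
    auto t = new int[](256);
    foreach (i, ref v; t)
        v = cast(int)(i * i);
    return t;
}

int lookup(size_t i)
{
    if (!tableReady) // the small cost of a check upon each use
    {
        table = buildTable();
        tableReady = true;
    }
    return table[i];
}

void main()
{
    assert(lookup(3) == 9); // the first call builds the table
    assert(lookup(4) == 16); // later calls only pay for the check
}
```

With no static constructor in the module, a program that imports it but never calls lookup does no initialization work at startup.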

I disagree with this assessment. It's good to know the cause of the problem, but let's look at the root issue -- reflection. The only reason to include class information for classes not being referenced is to be able to construct/use classes at runtime instead of at compile time. But if you look at D's runtime reflection capabilities, they are quite poor. You can only construct a class at runtime if it has a zero-arg constructor.
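To make the zero-arg limitation concrete, here is a small sketch using Object.factory from druntime, which creates an instance from a fully qualified class name. The module name factorydemo and the classes Greeter and NeedsArg are hypothetical, chosen only for this example.

```d
module factorydemo; // hypothetical module name

class Greeter
{
    // No explicit constructor, so it can be default-constructed.
    string greet() { return "hello"; }
}

class NeedsArg
{
    this(int x) {} // the only constructor takes an argument
}

void main()
{
    // Construct by name at runtime; the result is typed as Object.
    Object o = Object.factory("factorydemo.Greeter");
    auto g = cast(Greeter) o;
    assert(g !is null);
    assert(g.greet() == "hello");

    // A class with no zero-arg constructor cannot be created this way:
    // Object.factory returns null.
    assert(Object.factory("factorydemo.NeedsArg") is null);
}
```

The lookup by name only works because class information for every class in the module is kept in the binary, which is the cost being discussed here.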

So essentially, we are paying the penalty of having runtime reflection in terms of bloat, but get very little benefit.


I think there are two things that need to be considered:

1. We eventually should have some reasonably complete runtime reflection capability.
2. Runtime reflection and shared libraries go hand-in-hand. With shared library support, the bloat penalty isn't nearly as significant.

I don't think the right answer is to avoid using features of the language because the compiler/runtime has some design deficiencies. At some point these deficiencies will be fixed, and then we are left with a library that has seemingly odd design choices that we can't change.


-Steve
