On 16.12.2011 21:29, Andrei Alexandrescu wrote:
Hello,


Late last night Walter and I figured a few interesting tidbits of
information. Allow me to give some context, discuss them, and sketch a
few approaches for improving things.

A while ago Walter wanted to enable function-level linking, i.e. only
get the needed functions from a given (and presumably large) module. So
he arranged things that a library contains many small object "files"
(that actually are generated from a single .d file and never exist on
disk, only inside the library file, which can be considered an archive
like tar). Then the linker would only pick the used object "files" from
the library and link those in. Unfortunately that didn't have nearly the
expected impact - essentially the size of most binaries stayed the same.
The mystery was unsolved, and Walter needed to move on to other things.

One particularly annoying issue is that even programs that don't
ostensibly use anything from an imported module may balloon inexplicably
in size. Consider:

import std.path;
void main(){}

This program, after stripping and all, has some 750KB in size. Removing
the import line reduces the size to 218KB. That includes the runtime
support, garbage collector, and such, and I'll consider it a baseline.
(A similar but separate discussion could be focused on reducing the
baseline size, but herein I'll consider it constant.)

What we'd simply want is to be able to import stuff without blatantly
paying for what we don't use. If a program imports std.path and uses no
function from it, it should be as large as a program without the import.
Furthermore, the increase should be incremental - using 2-3 functions
from std.path should only increase the executable size by a little, not
suddenly link in all code in that module.

But in experiments it seemed like program size would increase in sudden
amounts when certain modules were included. After much investigation we
figured that the following fateful causal sequence happened:

1. Some modules define static constructors with "static this()" or
"shared static this()", and/or static destructors.

2. These constructors/destructors are linked in automatically whenever a
module is included.

3. Importing a module with a static constructor (or destructor) will
generate its ModuleInfo structure, which contains static information
about all module members. In particular, it keeps virtual table pointers
for all classes defined inside the module.

4. That means generating ModuleInfo refers to all virtual functions defined
in that module, whether they're used or not.

5. The phenomenon is transitive, e.g. even if std.path has no static
constructors but imports std.datetime which does, a ModuleInfo is
generated for std.path too, in addition to the one for std.datetime. So
now classes inside std.path (if any) will be all linked in.

6. It follows that a module that defines classes which in turn use other
functions in other modules, and has static constructors (or includes
other modules that do) will balloon the size of the executable suddenly.
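
The chain in steps 1-4 can be observed from inside a program. The following is only a minimal sketch, assuming druntime's ModuleInfo exposes `name` and `localClasses` and can be iterated with foreach as in current releases; the module name `demo`, the class `Unused`, and the helper `classesOf` are made up for illustration:

```d
module demo;

import std.stdio : writeln;

// A class that main() never instantiates or calls.
class Unused
{
    void neverCalled() {}
}

int table; // thread-local module variable

// Step 1: a static constructor, run before main().
static this()
{
    table = 42;
}

// Collect the class names that the runtime's ModuleInfo records for a module.
string[] classesOf(string moduleName)
{
    string[] names;
    foreach (m; ModuleInfo)          // iterate over every linked-in module
    {
        if (m.name != moduleName)
            continue;
        foreach (c; m.localClasses)  // TypeInfo_Class of each class in the module
            names ~= c.name;
    }
    return names;
}

void main()
{
    writeln(table);
    writeln(classesOf("demo"));
}
```

If `Unused` shows up in the list even though nothing uses it, that is the reference from ModuleInfo to the class (and through its vtable to its virtual functions) that keeps the code from being dropped at link time.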

There are a few approaches that we can use to improve the state of affairs.

A. On the library side, use static constructors and destructors
sparingly inside druntime and std. We can use lazy initialization
instead of compulsively initializing library internals. I think this is
often a worthy thing to do in any case (dynamic libraries etc) because
it only does work if and when work needs to be done at the small cost of
a check upon each use.

B. On the compiler side, we could use a similar lazy initialization
trick to only refer to class methods in the module if they're actually
needed. I'm being vague here because I'm not sure what and how that can
be done.
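
As a rough illustration of approach A, here is a sketch of replacing a static constructor with lazy initialization. None of these names come from Phobos or druntime; `buildTable`, `squares`, `cachedTable` and `buildCount` are hypothetical, and the cache is thread-local (the D default), so no locking is shown:

```d
import std.stdio : writeln;

// Eager version that this sketch replaces:
//
//     private int[] cachedTable;
//     static this() { cachedTable = buildTable(); }
//
// Lazy version: no static constructor, the work happens on first use.

private int[] cachedTable; // thread-local, null until first use
private int buildCount;    // counts how often the table was built

private int[] buildTable()
{
    ++buildCount;
    auto t = new int[](256);
    foreach (i, ref v; t)
        v = cast(int)(i * i);
    return t;
}

// Accessor that pays one null check per call instead of a static ctor.
int[] squares()
{
    if (cachedTable is null)
        cachedTable = buildTable();
    return cachedTable;
}

void main()
{
    writeln(squares()[3]); // first call builds the table
    writeln(squares()[4]); // second call reuses it
    writeln(buildCount);
}
```

The cost is the check on every call to `squares()`; the gain is that the module no longer has a static constructor, so it stops triggering the chain described above.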

Here's a list of all files in std using static cdtors:

std/__fileinit.d
std/concurrency.d
std/cpuid.d
std/cstream.d
std/datebase.d
std/datetime.d
std/encoding.d
std/internal/math/biguintcore.d
std/internal/math/biguintx86.d
std/internal/processinit.d
std/internal/windows/advapi32.d
std/mmfile.d
std/parallelism.d
std/perf.d
std/socket.d
std/stdiobase.d
std/uri.d

The majority of them don't do a lot of work and are not much used inside
phobos, so they don't blow up the executable. The main one that could
receive some attention is std.datetime. It has a few static ctors and a
lot of classes. Essentially just importing std.datetime or any std
module that transitively imports std.datetime (and there are many of
them) ends up linking in most of Phobos and blows the size up from the
218KB baseline to 700KB.

Jonathan, could I impose on you to replace all static cdtors in
std.datetime with lazy initialization? I looked through it and it
strikes me as a reasonably simple job, but I think you'd know better
what to do than me.

A similar effort could be conducted to reduce or eliminate static cdtors
from druntime. I made the experiment of commenting them all out, and that
reduced the size of the baseline from 218KB to 200KB. This is a good
amount, but not as dramatic as what we can get by working on std.datetime.


Thanks,

Andrei

Really sorry, but this sounds silly to me. It's a minor problem. Does anyone really care about a 600 KiB (3.5x) size change in an empty program? Yes, they do, but only if there are no other size increases in real programs.



Right now dmd can produce a file size increase of at least _two orders of magnitude_. I posted about that problem four months ago in "Building GtkD app on Win32 results in 111 MiB file mostly from zeroes".

An example of this bug is in archive:
http://deoma-cmd.ru/files/other/gtkD-1.5.1-size.7z

Built version (with *.exe and *.lib files):
http://deoma-cmd.ru/files/other/gtkD-1.5.1-size-built.7z


Detailed description:
GtkD is built using either a single object file (gtk-one-obj.lib) or separate object files, one per source file (gtk-sep-obj.lib).

Then main.d, which imports gtk.Main, is built using those libraries.

Then the zeroCount utility is built and run over the resulting files:
--------------------------------------------------
Now let's calculate zero bytes counts:
--------------------------------------------------
  Zero bytes|     %|    Non-zero| Total bytes|        File
     3628311| 21.56|    13202153|    16830464|gtk-one-obj.lib
     1953124| 15.98|    10272924|    12226048|gtk-sep-obj.lib
   127968798| 99.00|     1298430|   129267228|main-one-obj.exe
      743821| 37.51|     1239183|     1983004|main-sep-obj.exe
Done.

So we have to use a very slow per-file build to produce a reasonable (not 100 MiB) executable. Whichever *.exe is launched, its process allocates ~20 MiB of RAM (for the loaded Gtk DLLs).
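
For reference, a zero-byte counter along these lines could look like the sketch below. This is not the zeroCount source from the archive, just a hypothetical reconstruction that produces a table in the same layout; `ZeroStats` and `countZeros` are names made up here:

```d
import std.file : read;
import std.stdio : writefln;

// Zero-byte statistics for one file.
struct ZeroStats
{
    size_t zeros;
    size_t total;

    // Share of zero bytes in percent (0 for an empty file).
    double percent() const
    {
        return total ? 100.0 * zeros / total : 0.0;
    }
}

// Count the zero bytes in a buffer.
ZeroStats countZeros(const(ubyte)[] data)
{
    size_t zeros;
    foreach (b; data)
        if (b == 0)
            ++zeros;
    return ZeroStats(zeros, data.length);
}

// Usage: zerocount file1 file2 ...
void main(string[] args)
{
    writefln("%12s|%6s|%12s|%12s|%12s",
             "Zero bytes", "%", "Non-zero", "Total bytes", "File");
    foreach (name; args[1 .. $])
    {
        auto s = countZeros(cast(const(ubyte)[]) read(name));
        writefln("%12s|%6.2f|%12s|%12s|%s",
                 s.zeros, s.percent(), s.total - s.zeros, s.total, name);
    }
}
```

It reads each file fully into memory, which is fine for files of the sizes listed above.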



The second dmd issue (discovered because of the 99.00% of zeros) is that _it doesn't use the bss section_.
Let's look at a C++ program built using Microsoft's cl:
---
char arr[1024 * 1024 * 10];
int main() { }
---
It results in a ~10 KiB executable, because `arr` is zero-initialized and placed in the .bss section. If one of its elements is set to non-zero:
---
char arr[1024 * 1024 * 10] = { 1 };
int main() { }
---
The array can't be in .bss any more, so the resulting executable grows by ~10 MiB. The following D program results in a ~10 MiB executable:
---
ubyte[1024 * 1024 * 10] arr;
void main() { }
---
So, if there really is a reason not to use .bss, it should be clearly explained.



If the described issues aren't much more significant than "static this()", please show me where I am wrong.
