On Friday, 14 July 2017 at 22:45:44 UTC, H. S. Teoh wrote:
Here's a further update to the saga of combating ridiculously
large symbol sizes.
So yesterday I wrote a new module that also heavily uses UFCS
chains. My initial draft of the module, once I linked it with
the main program, particularly with a UFCS chain that has led
to the 600MB executable sizes seen above, caused another
explosion in symbol size that actually managed to reach 100MB
in *one* symbol, triggering a DMD termination complaining about
possible infinite template recursion. :-D Funnier still,
temporarily simplifying part of the chain to bring symbol sizes
down, I eventually got it below 100MB but ended up with linker
segfaults and ELF errors because the huge symbol was too
ridiculously huge.
Eventually, it drove me to refactor two Phobos functions that
are used heavily in my code: std.range.chunks and
std.algorithm.joiner, using the same "horcrux" technique (see
Phobos PRs #5610 and #5611). This, together with some further
refactoring in my own code, eventually brought things down to
the 20MB range of executable sizes.
Then an idea occurred to me: the reason these symbol sizes got
so large, was because the UFCS chain preserves *all* type
information about every component type used to build the final
type. So, by necessity, the type name has to somehow encode all
of this information in an unambiguous way. Now, arguably,
DMD's current mangling scheme is at fault because it contains
too many repeating components, but even if you disregard that,
the fact remains that if you have 50+ components in your
overall UFCS chain, the symbol length cannot be less than 50*n
where n is the average length of a single component's type
name, plus some necessary overhead to account for the mangling
scheme syntax. Let's say n is on average 20-25 characters, say
round it up to 35 for mangling syntax, so you're still looking
at upwards of 1700+ characters *minimum*. That, coupled with
the current O(n^2) / O(n^3) mangling scheme, you easily reach
megabytes of symbol length. We can compress the symbols all we
want, but there's a limit as to how much compression will help.
At the end of the day, you still have to represent those 50+
components *somehow*.
But what if most of this type information is actually
*unnecessary*? To use a range example, if all you care about
at the end of the chain is that it's a forward range of ubyte,
then why even bother with multi-MB symbols encoding type
information that's never actually used? Maybe a little
type-erasure would help, via hiding those 50+ component types
behind an opaque runtime OO polymorphic interface. Phobos does
have the facilities of this, in the form of the InputRange,
ForwardRange, etc., interfaces in std.range.interfaces. In my
case, however, part of the chain uses another generic type (a
kind of generalization of 2D arrays). But either way, the idea
is simple: at the end of the UFCS chain, wrap it in a class
object that inherits from a generic interface that encodes only
the primitives of the 2D array concept, and the element type.
The class object itself, of course, still must retain knowledge
of the full-fledged type, but the trick is that if we insert
this type erasure in the middle of the chain, then later
components don't have to encode the type names of earlier
components anymore.
T
I have some stupid questions:
- What does everyone mean when they say 'symbol' here? I'm
probably misunderstanding symbols gravely or it's something that
DMD in particular handles in a strange way.
- What type information are being kept because of UFCS chains?
Doesn't that mechanism simply apply overload resolution then
choose between the prefix and .method forms as appropriate,
rewriting the terms?
Then it's a problem of function invocation. I don't get
what's happening here still. Does this tie to the Voldemort types
problem? (=> are many of the functions in the chain returning
custom types?) It doesn't make sense, especially if, from your
indirection workaround, it looks like it would work around the
same but without the bloat problem if we unlinked the chain into
many intermediate temporary parts. So how is this a type
information issue?
Thanks!