On Tuesday, 24 May 2016 at 16:22:56 UTC, Timon Gehr wrote:
It's _exponential_ growth. We don't even want to spend the time and memory required to generate the strings.

The reason we have this discussion is that the worst case isn't rare enough to make this argument. Why compress in the first place if mangled names don't grow large in practice?

Since Walter is evidently still having a hard time understanding this, I've done a few more pathological cases, comparing LZ4 to Back Referencing Named Types.

For BRNT I manually replaced struct names in the mangling with N<number> numeric identifiers for all but one of their appearances, to simulate what could actually be done by the compiler. No other mangling changes (e.g., for eponymous templates or hidden types) were applied.

auto foo(T)(T x)
{
    static struct Vold { T payload; }
    return Vold(x);
}


At a chain of length 10:
foo(5).foo.foo.foo.foo.foo.foo.foo.foo.foo

current: Generate a 57_581 byte symbol
lz4 -9: Generate a 57_581 byte symbol, then compress it to 412 bytes lzma -9: Generate a 57_581 byte symbol, then compress it to 239 bytes
BRNT: Generate a 332 byte symbol


Upping it to twelve levels:
foo(5).foo.foo.foo.foo.foo.foo.foo.foo.foo.foo.foo

current: Generate a 230_489 byte symbol
lz4 -9: Generate a 230_489 byte symbol, then compress it to 1128 bytes lzma -9: Generate a 230_489 byte symbol, then compress it to 294 bytes
BRNT: Generate a 393 byte symbol

Upping it to fourteen levels:

I'm too lazy to do more manual BRNT, so beyond this point its numbers are estimated based on the previously established fact that BRNT symbols grow linearly.

current: Generate a 922_127 byte symbol
lz4 -9: Generate a 922_127 byte symbol, then compress it to 4012 bytes
lzma -9: Generate a 922_127 byte symbol, compress it to 422 bytes
BRNT: Generate a ~457 byte symbol

Upping it to sixteen levels:

current: Generate a 3_688_679 byte symbol
lz4 -9: Generate a 3_688_679 byte symbol, then compress it to 15535 bytes lzma -9: Generate a 3_688_679 byte symbol, then compress it to 840 bytes
BRNT: Generate a ~527 byte symbol



I want to let that sink in: in the general case, BRNT beats even **LZMA**.



As if winning the compactness war while still being a symbol linkers won't have problems with wasn't enough, this approach also avoids generating giant symbols in the first place. Compression and/or hashing cannot do that. If D really wants to, it can compress symbols, but it *needs* to fix this problem properly *first*.

Reply via email to