On Friday, 20 May 2016 at 16:21:55 UTC, ZombineDev wrote:
On Friday, 20 May 2016 at 13:16:32 UTC, Johan Engelen wrote:
On Friday, 20 May 2016 at 12:57:40 UTC, ZombineDev wrote:
As I said earlier, it would be best if we can prevent the
generation of long symbols in the first place, because that
would improve compilation times significantly.
From what I've observed, generating the long symbol name
itself is fast. If we avoid the deep type hierarchy, then
indeed I think you can expect a compile-time improvement.
Walter's PR slows down compilation by 25-40% according
to my tests. I expect that compilation would be faster if the
whole process were skipped altogether.
MD5 hashing slowed down builds by a few percent for Weka
(note: LDC's machine code generation is slower than DMD's, so
percentage-wise...), which can then be compensated for using
PGO ;-) /+ <-- shameless PGO plug +/
IIUC, your scheme works like this:
1. DMDFE creates a mangled symbol name
2. Create an MD5 hash of the symbol and use the hash instead
of the full name.
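To illustrate, here is a minimal sketch of that scheme in D. The helper name, the stand-in prefix and the threshold are my own inventions for the example; LDC's actual implementation differs:

import std.digest.md; // md5Of; publicly imports the generic digest API (toHexString)

/// Hypothetical helper (not LDC's actual code): if a mangled name
/// exceeds the threshold, emit a short stand-in built from its MD5
/// hash instead of the full name.
string shortenMangledName(string mangled, size_t threshold = 500)
{
    if (mangled.length <= threshold)
        return mangled;            // short enough: keep the full name
    // The hash fixes the emitted symbol's length regardless of how
    // deeply the original type hierarchy nests.
    auto hex = toHexString(md5Of(mangled));
    return "_D_hashed_" ~ hex[].idup;
}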
If a minimal change w.r.t. LoC in Georgi's almost trivial
program (compared to e.g. Weka's huge codebase) increases
compilation time from 1.5 sec to 27 sec, I can't imagine how
much slower it would be for a larger project.
We should cure the root cause. Generating uselessly large
symbols (even if they are hashed afterwards) is part of that
problem, so I think it should never be done once they start
growing beyond e.g. 500 bytes.
The solution that Steven showed is exactly what the compiler
should do. Another reason why the compiler should do it is
that voldemort types often capture their outer context (local
variables, alias parameters, delegates, etc.), which makes it
very hard for the user to manually extract the voldemort type
out of the function.
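For instance (a made-up example; the names are mine), here is a range-returning function whose voldemort result type captures locals:

// Hypothetical example of a voldemort type capturing its outer context.
auto scaledIota(int n, int factor)
{
    // `Range` is unnameable outside this function, and it reads `n` and
    // `factor` from the enclosing frame through a hidden context pointer,
    // so it cannot be moved to module scope without restructuring.
    struct Range
    {
        int i;
        bool empty() { return i >= n; }
        int front() { return i * factor; }
        void popFront() { ++i; }
    }
    return Range(0);
}

Hoisting `Range` out by hand means turning every captured variable into an explicit field or constructor parameter, which is exactly the kind of mechanical transformation the compiler can do reliably.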
I see two separate issues that I think should be handled
independently:
1) Exponential growth of symbol names with voldemort types.
I like Steven's solution where the compiler lowers the struct
to outside of the method (see the sketch after this list).
2) Long symbol names in general, which can arise even without
voldemort types, especially when chaining multiple
algorithms.
I like Johan Engelen's solution in LDC of hashing symbols
longer than a threshold. For symbols shorter than the
threshold, I think Walter's compression algorithm could be
used to gain a 40-60% reduction and still retain the full
type information.
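To make (1) concrete, the lowering could be roughly equivalent to rewriting the earlier scaledIota example by hand, with every captured variable turned into an explicit field (my sketch of the idea, not necessarily the exact transformation Steven proposed):

// Module-scope equivalent of the voldemort `Range` above: the captured
// context becomes explicit fields, and the struct's mangled name no
// longer embeds the mangled name of the function that returns it, which
// is what stops the recursive growth when such calls are chained.
private struct ScaledIotaRange
{
    int i, n, factor;
    bool empty() { return i >= n; }
    int front() { return i * factor; }
    void popFront() { ++i; }
}

auto scaledIota(int n, int factor)
{
    return ScaledIotaRange(0, n, factor);
}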