Re: Policy for exposing range structs

Liran Zvibel via Digitalmars-d Wed, 30 Mar 2016 12:22:45 -0700

On Sunday, 27 March 2016 at 17:01:39 UTC, David Nadlinger wrote:

Compression in the usual sense won't help. Sure, it might reduce the object file size, but the full string will again have to be generated first, still requiring absurd amounts of time and space. The latter is definitely not negligible for symbol names of several hundreds of kilobytes; it shows up prominently in compiler profiles of affected Weka builds.

We love Voldemort types at Weka, and use them a lot in our non-gc-allocating ranges and algorithm libraries. Also, we liberally nest templates inside of other templates. I don't think we could do many of the things we do if we had to define everything at module level. This flexibility is amazing for us and part of the reason we love D.


But, as David said -- it comes at a great price for us.

I just processed our biggest executable, and came up with the following numbers:

Total symbols: 99649
Symbols longer than 1k: 9639
Symbols longer than 500k: 102
Symbols longer than 1M: 62

The longest symbols are about 5 Mbytes!
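For anyone who wants to gather similar numbers on their own binary, here is a minimal sketch of the counting step. It assumes the symbol names have already been dumped into a list of strings (for example, from a symbol-listing tool). The function name `length_histogram` and the decimal thresholds (1k = 1000 bytes) are assumptions made for illustration, not necessarily what was used for the figures above.

```python
def length_histogram(symbols, thresholds=(1_000, 500_000, 1_000_000)):
    """Count how many symbol names exceed each length threshold (in bytes).

    `symbols` is any iterable of mangled names as strings. The thresholds
    are hypothetical defaults chosen to mirror the buckets in this post.
    """
    # Measure in bytes, since that is what ends up in the object file.
    lengths = [len(s.encode("utf-8")) for s in symbols]
    stats = {"total": len(lengths), "longest": max(lengths, default=0)}
    for t in thresholds:
        stats[f"longer_than_{t}"] = sum(1 for n in lengths if n > t)
    return stats
```

Calling `length_histogram(names)` returns a dictionary with one count per bucket plus the total and the longest length seen.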

This affects our exe sizes in a terrible way, and also increases our compile and link times considerably. I will only be able to come up with statistics of how much time was wasted due to too-long symbols after we fix it, but obviously this is a major problem for us.

I think we should try the solution proposed by Anon, as it has a good possibility of saving quite a bit. It's important to make sure that when a template is given as a template parameter, the complete template is treated as the LName.

Thinking about the compression idea by Andrei, I think we get such long names since we have huge symbols that are being passed as Voldemort names to template parameters. Then we repeat the huge symbols several times in the new template. Think of a .5M symbol passed a few times to a template; this is probably how we get to 5M size symbols.

This could end up being too complex, but if we assign "Huffman coding"-like names to the complete template names in a module scope (let's say, only if the template name is longer than 30 bytes), we will then be able to replace a very long string with the Huffman-coded version. Coupled with the LName+Number idea above, this should shorten symbol names considerably.
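To make the growth argument concrete, here is a toy model in Python. It is not the real D mangling scheme, and it is not necessarily the exact scheme Anon proposed: the `Name!(args)` spelling, the `#k` back-reference token, and the helper names `nest`, `mangle_full` and `mangle_backref` are all assumptions made for illustration. A type is modelled as a `(name, args)` tuple, and each nesting level passes the previous type three times as a template argument.

```python
def nest(depth):
    """Build a type where each level takes the previous one three times."""
    node = ("Range", ())
    for _ in range(depth):
        node = ("Map", (node, node, node))
    return node


def mangle_full(node):
    """Spell every argument out in full, so repeated arguments are repeated."""
    name, args = node
    if not args:
        return name
    return name + "!(" + ",".join(mangle_full(a) for a in args) + ")"


def mangle_backref(node, seen=None):
    """Spell each distinct type once; later uses become a short '#k' token.

    `seen` maps an already emitted type to its index k within this symbol,
    so the result can be decoded without any table outside the symbol.
    """
    if seen is None:
        seen = {}
    if node in seen:
        return f"#{seen[node]}"
    name, args = node
    if args:
        text = name + "!(" + ",".join(mangle_backref(a, seen) for a in args) + ")"
    else:
        text = name
    # Register only after the full spelling has been emitted once.
    seen[node] = len(seen)
    return text
```

In this model the fully spelled name roughly triples with every nesting level, while the back-referenced name grows by a small constant per level. For example, `mangle_backref(nest(2))` gives `Map!(Map!(Range,#0,#0),#1,#1)`.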

An initial implementation could start with just the LName# solution, and then we can see if we also have to recursively couple it with Huffman coding of the resulting template names.


Liran
