Re: Policy for exposing range structs

Steven Schveighoffer via Digitalmars-d Thu, 31 Mar 2016 06:16:40 -0700

On 3/30/16 3:19 PM, Liran Zvibel wrote:

On Sunday, 27 March 2016 at 17:01:39 UTC, David Nadlinger wrote:

Compression in the usual sense won't help. Sure, it might reduce the
object file size, but the full string will again have to be generated
first, still requiring absurd amounts of time and space. The latter is
definitely not negligible for symbol names of several hundreds of
kilobytes; it shows up prominently in compiler profiles of affected
Weka builds.


We love Voldemort types at Weka, and use them a lot in our
non-gc-allocating ranges and algorithm libraries. Also, we liberally
nest templates inside of other templates.
I don't think we can do many of the things we do if we had to define
everything at module level. This flexibility is amazing for us and part
of the reason we love D.

Voldemort types are what cause the bloat; templates inside templates aren't as much of a problem. It's because the Voldemort type has to include in its symbol name at least twice, and I think 3 times actually (given the return type), the template parameter/function parameter types of the function it resides in. If the template is just a template, it's just included once. This is why moving the type outside the function is effective at mitigation. It's linear growth vs. exponential.

I too like Voldemort types, but I actually found moving the types outside the functions quite straightforward. It's just annoying to have to repeat the template parameters. If you make them private, then you can simply avoid all the constraints. It's a bad leak of implementation, since now anything in the file has access to that type directly, but it's better than the issues with Voldemort types.
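To make the refactoring concrete, here is a minimal sketch of the before and after. The names everyOtherV, everyOther, and EveryOther are hypothetical and chosen only for illustration; this is not code from iopipe. It assumes a simple input range wrapper that skips every second element.

```d
import std.range.primitives; // isInputRange, plus empty/front/popFront for arrays

// Before: a Voldemort type. Result can only be named inside the function,
// and its mangled name embeds the enclosing function's signature.
auto everyOtherV(R)(R r) if (isInputRange!R)
{
    static struct Result
    {
        R src;
        @property bool empty() { return src.empty; }
        @property auto front() { return src.front; }
        void popFront()
        {
            src.popFront();
            if (!src.empty) src.popFront();
        }
    }
    return Result(r);
}

// After: the same struct at module level. It is private, so it does not
// repeat the isInputRange constraint; the public function still checks it.
// The cost is that anything else in this module can now name EveryOther.
private struct EveryOther(R)
{
    R src;
    @property bool empty() { return src.empty; }
    @property auto front() { return src.front; }
    void popFront()
    {
        src.popFront();
        if (!src.empty) src.popFront();
    }
}

auto everyOther(R)(R r) if (isInputRange!R)
{
    return EveryOther!R(r);
}

void main()
{
    import std.algorithm.comparison : equal;

    // Both versions behave identically; only the type's name differs.
    assert(everyOtherV([1, 2, 3, 4, 5]).equal([1, 3, 5]));
    assert(everyOther([1, 2, 3, 4, 5]).equal([1, 3, 5]));
}
```

The only repetition is the template parameter list on EveryOther, which is the annoyance mentioned above.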

See the update to my iopipe library here: https://github.com/schveiguy/iopipe/commit/1b0696dc82fce500c6b314ec3d8e5e11e0c1bcd7

This one commit made my example program 'convert' (https://github.com/schveiguy/iopipe/blob/master/examples/convert/convert.d) save over 90% binary size (went from 10MB to <1MB).

This also calmed down some REALLY horrible stack traces when I was debugging. As in, I could actually understand what function it was talking about, and it didn't take 10 seconds to print the stack trace.


But, as David said -- it comes with a great price for us.

I just processed our biggest executable, and came up with the following
numbers:
total symbols: 99649
Symbols longer than 1k: 9639
Symbols longer than 500k: 102
Symbols longer than 1M: 62. The longest symbols are about 5M bytes!

This affects our exe sizes in a terrible way, and also increases our
compile and link times considerably. I will only be able to come up with
statistics of how much time was wasted due to too-long-symbols after we
fix it, but obviously this is a major problem for us.

From my testing, it doesn't take much to get to the point where the linker is unusable. A simple struct when nested in 15 calls to a function makes the linker take an unreasonable amount of time (over 1.5 minutes; I didn't wait to see how long). See my bug report for details.

Another factor in the name length is the module name, which is included in every type and function. So you have a factor like 3^15 for the name, but then you multiply this by the module names as well.
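The growth can be observed directly with .mangleof. The following is a rough sketch, not the code from the bug report: wrapV, wrap, Wrapper, Nest, and mangledLength are hypothetical names introduced here. It nests a trivial struct a few levels deep, once as a Voldemort type and once as a module-level type, and prints the mangled name length of each. The absolute numbers depend on the module name and the compiler version, and compilers released after this thread may compress repeated name components, so no particular output is claimed.

```d
import std.stdio : writefln;

// Voldemort version: the returned struct lives inside the function.
auto wrapV(T)(T value)
{
    static struct Result { T payload; }
    return Result(value);
}

// Module-level version of the same struct.
private struct Wrapper(T) { T payload; }

auto wrap(T)(T value)
{
    return Wrapper!T(value);
}

// Type obtained by applying fn to an int, `depth` times in a row.
template Nest(alias fn, size_t depth)
{
    static if (depth == 0)
        alias Nest = int;
    else
        alias Nest = typeof(fn(Nest!(fn, depth - 1).init));
}

// Length in bytes of the mangled name of T.
enum size_t mangledLength(T) = T.mangleof.length;

void main()
{
    // Depth is kept small on purpose; the thread reports that 15 levels
    // were enough to make the linker unusable at the time.
    static foreach (depth; 1 .. 7)
        writefln("depth %s: Voldemort %s bytes, module-level %s bytes",
                 depth,
                 mangledLength!(Nest!(wrapV, depth)),
                 mangledLength!(Nest!(wrap, depth)));
}
```

Putting the same code in a module with a longer name should lengthen every line of output, which is the module-name multiplier described above.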

I think we should try the solution proposed by Anon, as it has a good
possibility of saving quite a bit.
It's important to make sure that when a template is given as a template
parameter, the complete template is treated as the LName.

I hope this is given serious thought; it looks like someone has already started an implementation.

Anon, it appears that your mechanism has been well received by a few knowledgeable people here. I encourage you to solidify your proposal in a DIP (D Improvement Proposal) here: http://wiki.dlang.org/DIPs.


-Steve
