Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-31 Thread Stefan Koch via Digitalmars-d

On Thursday, 31 August 2017 at 17:39:35 UTC, H. S. Teoh wrote:



This is just one example off the top of my head; I'm sure there 
are plenty of others that we can come up with once we break 
free of the C++ mold of thinking of templates as "copy-n-paste 
except substitute X with Y".  Another example that just 
occurred to me is the various recursive templates in std.meta 
for manipulating AliasSeq's.  There is actually no need for the 
compiler to instantiate anything at all, except perhaps for the 
final result. By treating the AliasSeq as an in-compiler data 
type with compile-time operations performed on it, the compiler 
ought to be able to evaluate the template without needing to 
instantiate anything past the first level of recursion.



T


That is a good example, and it is one where we could maybe 
deduce what is going on.
However, the classification of a template (or rather, the 
deduction of unused instantiations) actually requires us to 
instantiate the template, because we cannot predict what it 
might evaluate to given specific parameters.
For this to work we would have to produce all possible 
instantiation behaviors of a template and deduplicate that set.

Which happens to be infinite, in most cases.
Even predicting if the set of instantiations can be finite is 
essentially an instance of the halting problem.


I've worked on this for some time until I realized that I was 
fighting without gaining ground.


It would end up as a myriad of special cases and would be 
impossible to maintain.


Considering that the current template system, which does not try 
anything smart, is already fiendishly complicated, I'd rather 
not introduce an even more complicated 
template-instance-shape predictor which would probably only work 
on the simplest of cases and is not guaranteed to bring wins 
that would justify the time it would take to implement.


TL;DR

The change to optimize certain groups of templates will require 
MASSIVE amounts of work and is unlikely to pay for itself.


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-31 Thread H. S. Teoh via Digitalmars-d
On Thu, Aug 31, 2017 at 04:36:41PM +, Stefan Koch via Digitalmars-d wrote:
> On Wednesday, 30 August 2017 at 16:34:13 UTC, H. S. Teoh wrote:
> > 
> > The CTFE problem will be fixed soon, once Stefan Koch finishes his
> > newCTFE engine.
> > 
> > Templates are still an area where a lot of improvement can be made.
> > But Stefan has indicated interest in visiting this area in the
> > compiler after the newCTFE project is done.  So eventually we'll get
> > there.
> > 
> > 
> > T
> 
> Yes that is true.
> However since I've just taken a full-time job I cannot say when it
> will be finished.
> 
> As for templates I have chosen to work around the issue entirely and
> integrate type-manipulation and introspection abilities with ctfe.
> (Which is inherently more scalable than templates)

I think D's conception of templates, which IMO goes beyond C++'s
(conceptually, anyway; the current implementation is not too different
fundamentally), can be implemented in a much better way that makes it
more scalable.

C++ sees templates as code stencils to be instantiated with the given
template arguments, i.e., copy-n-paste the stencil and substitute
template parameters with the given arguments.

D's templates for the most part still retain that, and certainly the
current implementation essentially just follows the C++ model. But IMO
there's an important conceptual difference: D sees template parameters
not so much as parameters for code stencils to be copy-n-pasted, but as
*compile-time* parameters to a function / set of declarations.  I.e.,
they are parameters that are known at compile-time rather than runtime.

The distinction is subtle but IMO important, because it allows us to
break out of the C++ mold of copy-n-paste-this-stencil that is one of
the causes of template scalability problems (pass too many different
template arguments, and you get massive code bloat, one copy per
instantiation).  By treating template arguments as arguments known at
compile-time instead, the compiler can get smarter with when to
copy-n-paste a declaration, and when to just evaluate the template with
the argument without actually instantiating anything (no copying of AST
nodes, etc.).

A common example is a templated linked-list container, where much of the
linked-list pointer manipulation code is actually independent of the
type of the contained data.  The compiler really only needs to generate
distinct copies of the code when the data itself is being manipulated;
all the other code can be shared between template instantiations.

This is just one example off the top of my head; I'm sure there are
plenty of others that we can come up with once we break free of the C++
mold of thinking of templates as "copy-n-paste except substitute X with
Y".  Another example that just occurred to me is the various recursive
templates in std.meta for manipulating AliasSeq's.  There is actually no
need for the compiler to instantiate anything at all, except perhaps for
the final result. By treating the AliasSeq as an in-compiler data type
with compile-time operations performed on it, the compiler ought to be
able to evaluate the template without needing to instantiate anything
past the first level of recursion.


T

-- 
"How are you doing?" "Doing what?"


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-31 Thread Stefan Koch via Digitalmars-d

On Wednesday, 30 August 2017 at 16:34:13 UTC, H. S. Teoh wrote:


The CTFE problem will be fixed soon, once Stefan Koch finishes 
his newCTFE engine.


Templates are still an area where a lot of improvement can be 
made. But Stefan has indicated interest in visiting this area 
in the compiler after the newCTFE project is done.  So 
eventually we'll get there.



T


Yes, that is true.
However, since I've just taken a full-time job, I cannot say 
when it will be finished.


As for templates, I have chosen to work around the issue 
entirely and integrate type-manipulation and introspection 
abilities with CTFE. (Which is inherently more scalable than 
templates.)


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-30 Thread via Digitalmars-d

On Wednesday, 30 August 2017 at 16:39:32 UTC, H. S. Teoh wrote:
On Wed, Aug 30, 2017 at 08:01:17AM +, via Digitalmars-d 
wrote: [...]
D supports separate compilation by design. I.e. it doesn't 
require all the source files corresponding to all the object 
files being linked to produce the final executable, to be 
loaded in memory by the compiler.


Yes, I think the general recommendation for very large 
codebases is to compile one package at a time, i.e., one subdir 
at a time.


Yep, that avoids parsing the same file over and over again (as 
would be the case in C++ with commonly used header files in each 
translation unit), yet provides enough granularity for 
parallelism and also somewhat limits the memory problem.


Nevertheless, compiler memory usage remains an issue on 
low-memory systems.  My old work PC simply cannot handle 
running dmd without grinding to a halt because there's just not 
enough RAM to go around (only 500MB total).  It's generally not 
a problem for modern PCs with at least a few GB of memory.  
Walter chose compilation speed over memory efficiency, so 
that's just the way it is.  In theory, one could turn on GC in 
the compiler so that it will work on low-memory systems, but


(Nods)


I'm not sure if such a change will be accepted into dmd.


T


We could always make that a compiler switch. I'm not sure 
whether a substantial benefit could be gained during parsing and 
semantic analysis (e.g. consider staticIota: each instantiation 
needs to be kept in memory, since you never know when it will be 
needed again), but certainly once you convert the AST to the 
backend IR, you could immediately drop all memory allocated by 
the frontend.


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-30 Thread H. S. Teoh via Digitalmars-d
On Wed, Aug 30, 2017 at 08:01:17AM +, via Digitalmars-d wrote:
[...]
> D supports separate compilation by design. I.e. it doesn't require all
> the source files corresponding to all the object files being linked to
> produce the final executable, to be loaded in memory by the compiler.

Yes, I think the general recommendation for very large codebases is to
compile one package at a time, i.e., one subdir at a time.
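
Concretely, such a build might look like this (the directory 
layout is hypothetical; dmd's -c, -od and -of switches are 
real):

```shell
# Compile each package separately so the compiler only holds one
# package's ASTs in memory at a time, then link the objects.
dmd -c -odbuild src/pkga/*.d
dmd -c -odbuild src/pkgb/*.d
dmd build/*.o -ofapp
```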

Nevertheless, compiler memory usage remains an issue on low-memory
systems.  My old work PC simply cannot handle running dmd without
grinding to a halt because there's just not enough RAM to go around
(only 500MB total).  It's generally not a problem for modern PCs with at
least a few GB of memory.  Walter chose compilation speed over memory
efficiency, so that's just the way it is.  In theory, one could turn on
GC in the compiler so that it will work on low-memory systems, but I'm
not sure if such a change will be accepted into dmd.


T

-- 
Questions are the beginning of intelligence, but the fear of God is the 
beginning of wisdom.


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-30 Thread H. S. Teoh via Digitalmars-d
On Wed, Aug 30, 2017 at 02:53:42AM +, Nicholas Wilson via Digitalmars-d 
wrote:
[...]
> The problem with D is the memory-hogging nature of CTFE and the sheer
> number of templates that get instantiated when compiling big
> codebases. Symbol length is also a problem, but that eats your disk
> space, not your RAM.

The symbol length problem has been greatly reduced by Rainer's recent
mangling PR, which is now in git master and will be in the next
release. There is still room for further reduction, but the remaining
gains will be comparatively minor. Rainer's fix has eliminated the
bulk of the symbol length problem.

The CTFE problem will be fixed soon, once Stefan Koch finishes his
newCTFE engine.

Templates are still an area where a lot of improvement can be made. But
Stefan has indicated interest in visiting this area in the compiler
after the newCTFE project is done.  So eventually we'll get there.


T

-- 
Famous last words: I *think* this will work...


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-30 Thread via Digitalmars-d

On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda wrote:
I'm referring to this thread about Crystal --  
https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable


Crystal is strongly typed, but overwhelmingly uses type 
inference, rather than explicit types. Because Crystal aims to 
be spiritually—and frequently literally—compatible with Ruby, 
that’s a problem: to accomplish that, Crystal relies on 
sometimes-nullable types with implicit structure and implicit 
unions, such that, frequently, the only way to even begin type 
inference is to load the entire program’s AST into RAM all at 
once and then start your massive type inference pass. What 
you’re seeing in this thread is how a “simple” fix to a YAML 
parser error reporting hit that problem, causing Crystal to 
use a critical amount too much RAM and OOM.


How does D compare in this regard, especially in cases where 
`auto` storage class specifiers are used liberally throughout 
the code base?


D supports separate compilation by design. I.e. it doesn't 
require all the source files corresponding to all the object 
files being linked to produce the final executable, to be loaded 
in memory by the compiler.


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-30 Thread via Digitalmars-d
On Wednesday, 30 August 2017 at 02:53:42 UTC, Nicholas Wilson 
wrote:
On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda 
wrote:
I'm referring to this thread about Crystal --  
https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable


Crystal is strongly typed, but overwhelmingly uses type 
inference, rather than explicit types. Because Crystal aims 
to be spiritually—and frequently literally—compatible with 
Ruby, that’s a problem: to accomplish that, Crystal relies on 
sometimes-nullable types with implicit structure and implicit 
unions, such that, frequently, the only way to even begin 
type inference is to load the entire program’s AST into RAM 
all at once and then start your massive type inference pass. 
What you’re seeing in this thread is how a “simple” fix to a 
YAML parser error reporting hit that problem, causing Crystal 
to use a critical amount too much RAM and OOM.


How does D compare in this regard, especially in cases where 
`auto` storage class specifiers are used liberally throughout 
the code base?


Auto is not the problem: you can easily figure out the return 
type of a function that returns a primitive type, and aggregates 
have to be specified.


The problem with D is the memory-hogging nature of CTFE and the 
sheer number of templates that get instantiated when compiling 
big codebases. Symbol length is also a problem, but that eats 
your disk space, not your RAM.


Symbols do eat your RAM, because the compiler has to generate 
them in RAM before writing them to a binary; also, don't forget 
pragma(msg, symbol.mangleof) - the mangled name of each symbol 
is available for use in CTFE. However, the mangling problems 
should finally be put to rest, thanks to Rainer Schuetze, now 
that
https://github.com/dlang/dmd/pull/5855 / 
https://github.com/dlang/dmd/pull/6998 have been merged.


Re: Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-29 Thread Nicholas Wilson via Digitalmars-d

On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda wrote:
I'm referring to this thread about Crystal --  
https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable


Crystal is strongly typed, but overwhelmingly uses type 
inference, rather than explicit types. Because Crystal aims to 
be spiritually—and frequently literally—compatible with Ruby, 
that’s a problem: to accomplish that, Crystal relies on 
sometimes-nullable types with implicit structure and implicit 
unions, such that, frequently, the only way to even begin type 
inference is to load the entire program’s AST into RAM all at 
once and then start your massive type inference pass. What 
you’re seeing in this thread is how a “simple” fix to a YAML 
parser error reporting hit that problem, causing Crystal to 
use a critical amount too much RAM and OOM.


How does D compare in this regard, especially in cases where 
`auto` storage class specifiers are used liberally throughout 
the code base?


Auto is not the problem: you can easily figure out the return 
type of a function that returns a primitive type, and aggregates 
have to be specified.


The problem with D is the memory-hogging nature of CTFE and the 
sheer number of templates that get instantiated when compiling 
big codebases. Symbol length is also a problem, but that eats 
your disk space, not your RAM.


Compiler scalability. Question inspired by OOM errors seen by Crystal language

2017-08-29 Thread Pradeep Gowda via Digitalmars-d
I'm referring to this thread about Crystal --  
https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable


Crystal is strongly typed, but overwhelmingly uses type 
inference, rather than explicit types. Because Crystal aims to 
be spiritually—and frequently literally—compatible with Ruby, 
that’s a problem: to accomplish that, Crystal relies on 
sometimes-nullable types with implicit structure and implicit 
unions, such that, frequently, the only way to even begin type 
inference is to load the entire program’s AST into RAM all at 
once and then start your massive type inference pass. What 
you’re seeing in this thread is how a “simple” fix to a YAML 
parser error reporting hit that problem, causing Crystal to use 
a critical amount too much RAM and OOM.


How does D compare in this regard, especially in cases where 
`auto` storage class specifiers are used liberally throughout the 
code base?