Re: Linus with some good observations on garbage collection

2011-04-22 Thread Andrej Mitrovic
Just a reminder: that post is 9 years old.


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Alvaro

El 22/04/2011 19:36, Walter Bright escribió:

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


I've always been surprised when discussions usually just bring garbage 
collection as the only alternative to explicit manual memory management. 
I imagined it as a garbage truck that has its own schedule and may let a 
lot of trash pile up before passing by. I always naively thought, why 
not just free immediately when an object gets no references?


Not an expert, so there may be reasons I don't see, but now that Linus 
says somethnig along the lines, I'll ask. Why not? Isn't it much easier 
to do refcount++ and refcount--, and if refcount==0 immediately 
"free()"? Memory will be available to other needs faster, no need for an 
additional thread, or a lot of memory consumed before the advanced 
garbage truck decides to come in, or slight pauses when collecting trash 
(maybe only in old implementations), and the implementation is much 
simpler...


OK, I knew about that "cyclic references" problem. But Linus doesn't 
seem to see a big problem and solutions can be found with care...


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Michael Stover
This sort of reference count with cyclic dependency detector is how a lot of
scripting languages do it, or did it in the past.  The problem was that lazy
generational GCs are believed to have better throughput in general.

I'd like to say "were proved" rather than "are believed", but I don't
actually know where to go for such evidence.  However, I do believe many
scripting languages, such as python, eventually ditched the reference
counting technique for generational, and Java has very fast GC, so I am
inclined to believe those real-life solutions than Linus.

Mike

On Fri, Apr 22, 2011 at 2:32 PM, Alvaro  wrote:

> El 22/04/2011 19:36, Walter Bright escribió:
>
>  http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
>>
>
> I've always been surprised when discussions usually just bring garbage
> collection as the only alternative to explicit manual memory management. I
> imagined it as a garbage truck that has its own schedule and may let a lot
> of trash pile up before passing by. I always naively thought, why not just
> free immediately when an object gets no references?
>
> Not an expert, so there may be reasons I don't see, but now that Linus says
> somethnig along the lines, I'll ask. Why not? Isn't it much easier to do
> refcount++ and refcount--, and if refcount==0 immediately "free()"? Memory
> will be available to other needs faster, no need for an additional thread,
> or a lot of memory consumed before the advanced garbage truck decides to
> come in, or slight pauses when collecting trash (maybe only in old
> implementations), and the implementation is much simpler...
>
> OK, I knew about that "cyclic references" problem. But Linus doesn't seem
> to see a big problem and solutions can be found with care...
>


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Steven Schveighoffer

On Fri, 22 Apr 2011 14:32:06 -0400, Alvaro  wrote:


El 22/04/2011 19:36, Walter Bright escribió:

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


I've always been surprised when discussions usually just bring garbage  
collection as the only alternative to explicit manual memory management.  
I imagined it as a garbage truck that has its own schedule and may let a  
lot of trash pile up before passing by. I always naively thought, why  
not just free immediately when an object gets no references?


Because you then have to update potentially two reference counts every  
time you assign a pointer.  GC's save you from doing that.


I know way way less than Torvalds, but my naive brain says GC's still win  
because often times, slightly noticeable drops in performance are worth  
having code that doesn't corrupt memory.  This may not be true for kernel  
development, but then again, we aren't all developing kernels ;)


-Steve


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Brad Roberts
Also add to it that in many cases you're dealing with a threaded environment, 
so those refcounts have to be locked
(either via mutexes, or more commonly just atomic) operations which are far 
more expensive than non-atomic.  More so
when there's actual contention for the refcounted resource.

On 4/22/2011 11:53 AM, Michael Stover wrote:
> This sort of reference count with cyclic dependency detector is how a lot of 
> scripting languages do it, or did it in the
> past.  The problem was that lazy generational GCs are believed to have better 
> throughput in general.
> 
> I'd like to say "were proved" rather than "are believed", but I don't 
> actually know where to go for such evidence.
>  However, I do believe many scripting languages, such as python, eventually 
> ditched the reference counting technique for
> generational, and Java has very fast GC, so I am inclined to believe those 
> real-life solutions than Linus.
> 
> Mike
> 
> On Fri, Apr 22, 2011 at 2:32 PM, Alvaro  > wrote:
> 
> El 22/04/2011 19:36, Walter Bright escribió:
> 
> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
> 
> 
> I've always been surprised when discussions usually just bring garbage 
> collection as the only alternative to
> explicit manual memory management. I imagined it as a garbage truck that 
> has its own schedule and may let a lot of
> trash pile up before passing by. I always naively thought, why not just 
> free immediately when an object gets no
> references?
> 
> Not an expert, so there may be reasons I don't see, but now that Linus 
> says somethnig along the lines, I'll ask. Why
> not? Isn't it much easier to do refcount++ and refcount--, and if 
> refcount==0 immediately "free()"? Memory will be
> available to other needs faster, no need for an additional thread, or a 
> lot of memory consumed before the advanced
> garbage truck decides to come in, or slight pauses when collecting trash 
> (maybe only in old implementations), and
> the implementation is much simpler...
> 
> OK, I knew about that "cyclic references" problem. But Linus doesn't seem 
> to see a big problem and solutions can be
> found with care...
> 
> 



Re: Linus with some good observations on garbage collection

2011-04-22 Thread Timon Gehr
Brad Roberts wrote:
> Also add to it that in many cases you're dealing with a threaded environment, 
> >
so those refcounts have to be locked
> (either via mutexes, or more commonly just atomic) operations which are far 
> more
expensive than non-atomic.  More so
> when there's actual contention for the refcounted resource.

That is only a problem if the reference count of that resource changes at a very
high frequency. The described problem also implies that the threads would not 
need
any form of synchronization for the data (otherwise the reference count 
certainly
would not be a bottleneck.)

I cannot, at the moment, think of a real-world example where this would not 
imply
bad design. Can you help me out?

Michael Stover wrote:
> I'd like to say "were proved" rather than "are believed", but I don't actually
know where to go for such evidence.
>  However, I do believe many scripting languages, such as python, eventually
ditched the reference counting technique for
> generational, and Java has very fast GC, so I am inclined to believe those
real-life solutions than Linus.
>
> Mike

Well, the GC may be fast when compared to other GCs, but it has to be designed 
to
run on general data whose reference structure can be arbitrary. Often, the
objects/references have a somewhat specialized structure that a smart programmer
can exploit, especially if the references point to private data. But that only
matters if performance is of prime interest, and the gains may be not very big.

But, as pointed out by Linus, the prime performance problem is _not_ the GC, but
the mindset that comes with it. Most programmers that "grew up" in a managed
environment tend to use very many "new" keywords all over their code, instead of
allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
this.)
When they then try to write a C++ program, they do the same. The resulting 
memory
bugs are then blamed on the lack of a GC (you can have GC in C/C++, but most of
the people I have talked to do not know this.) They then happily change back to
Java, that has a "very fast GC".

The important thing to note here is that the work required to deallocate all 
these
many memory locations does not magically disappear, but it is delegated to the 
GC,
which will mostly do it faster and more reliable than a programmer which has to 
do
it manually. But the problem does not lie in the deallocations, it's in the
allocations.

Consider this analogy:

Scenario 1: Many people like candy. Those are wrapped in colorful little pieces 
of
paper. Every day, every person buys one piece of candy in the candy shop (new
Candy()!) and on the way back home they throw away the wrapping paper somewhere 
on
the street (reassign reference). Those are garbage. In the evening, some creepy
guy comes to search the whole street for those small pieces of paper. Would you
call that guy a garbage collector? He collects garbage, but in the real world
garbage collectors work more like this:

Scenario 2: Still, many people like candy. Every person buys a bag of candy in 
the
candy shop once a year (new Candy[365]). When all the candy is eaten, they put 
all
the garbage in one bag and put it to their front door (reassign reference to 
whole
array). A very handsome guy collects all those bags. He is very much more
efficient than the guy in example 1. (Arguably, memory usage is bigger in that
particular example, but in computer programs, the allocating process can reuse 
the
memory. The analogy breaks down here.)

Note that I am not saying that garbage collection is bad. If the references 
form a
very complicated structure, or if a reference to the containing object is not
necessarily kept (iE. array slicing), it can be very useful as well as faster 
than
manual memory management. Also, custom allocators can reduce the
lots-of-small-allocations problem, but they have some overhead too. Advanced GCs
may do this automatically, I don't know.

The reason Java is garbage collected is _not_ performance, but primarily
reliability. In big programming projects, it can be hard to know where a 
reference
to an object may be kept, if there are circular references etc, as programs keep
expanding without the programmers understanding the whole project in detail. GC
also removes bugs related to memory management. I think true and reliable OO is
not possible without a GC. The downside is, that many Java/... programmers don't
concern themselves much with the internals of memory allocations, and are _very_
proud of it. This is also the reason I think it is a bad idea to deprecate D's
'delete'.


-Timon


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
Timon Gehr:

> But, as pointed out by Linus, the prime performance problem is _not_ the GC, 
> but
> the mindset that comes with it. Most programmers that "grew up" in a managed
> environment tend to use very many "new" keywords all over their code, instead 
> of
> allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
> this.)

In C99 (and Ada) you avoid the allocation of some dynamic arrays with new 
thanks to variable length arrays.


> This is also the reason I think it is a bad idea to deprecate D's 'delete'.

D used to have scoped class instances, scoped classes, and delete, their 
replacements are not good enough yet. In CommonLisp you have hints for the GC, 
they are safe and they help you help speedup the work of the GC. Such hints 
probably need to be integrated with the type system, so they may need to be 
built-ins as scope/delete were. I am not seeing enough discussion about this.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article
> Timon Gehr:
> > But, as pointed out by Linus, the prime performance problem is _not_ the 
> > GC, but
> > the mindset that comes with it. Most programmers that "grew up" in a managed
> > environment tend to use very many "new" keywords all over their code, 
> > instead of
> > allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
> > this.)
> In C99 (and Ada) you avoid the allocation of some dynamic arrays with new 
> thanks
to variable length arrays.

Variable length arrays are just sugary syntax for a call to alloca.

> > This is also the reason I think it is a bad idea to deprecate D's 'delete'.
> D used to have scoped class instances, scoped classes, and delete, their
replacements are not good enough yet. In CommonLisp you have hints for the GC,
they are safe and they help you help speedup the work of the GC. Such hints
probably need to be integrated with the type system, so they may need to be
built-ins as scope/delete were. I am not seeing enough discussion about this.
> Bye,
> bearophile


I've always felt that Vala's system is better thought out, which is incidentally
based on a reference counting system. This makes destructors in Vala 
deterministic
and can be used to implement an RAII pattern for resource management.
To get around the common pitfalls of reference counting systems, they introduce
two keywords which alter the relationship between the allocated object and the 
GC,
'weak' and 'unowned'.

Rather than bore you with the gritty details here, see link:
http://live.gnome.org/Vala/ReferenceHandling


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
Iain Buclaw:

> Variable length arrays are just sugary syntax for a call to alloca.

I have an enhancement request in Bugzilla on VLA, with longer discussions. Just 
two comments:
- It seems alloca() can be implemented with two different semantics: to 
deallocate at the end of the function or to deallocate at the end of the scope. 
Usually alloca() deallocates at the end of the function, but that semantic 
confusion is dangerous. VLA deallocate at the end of the scope, just like any 
other static array.
- To use alloca you need to use pointers, maybe even slices, it's not DRY, etc. 
So syntax sugar helps.

In the meantime I've changed my mind a little. Now D I prefer something better 
than C99 VLAs. I'd like D-VLAs with the same syntax as C99 VLAs but with a 
safer semantics, closer to this one (but the "alloca" used here must deallocate 
at the end of the scope):

enum size_t MAX_VLA_SIZE = 1024;
static assert (is(typeof(size) == size_t));
T* ptr = null;
if ((size * T.sizeof) < MAX_VLA_SIZE)
ptr = cast(T*)alloca(size * T.sizeof);
T[] array = (ptr == null) ? new T[size] : ptr[0 .. size];
array[] = T.init;

This has some advantages: when alloca returns null, or when the array is large, 
it uses the GC. This allows to both avoid some stack overflows and reduce the 
risk of the stack memory from becoming too much cold.


> Rather than bore you with the gritty details here, see link:
> http://live.gnome.org/Vala/ReferenceHandling

This is interesting.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
> Now D I prefer something better than C99 VLAs. I'd like D-VLAs with the same 
> syntax as C99 VLAs but with a safer semantics,

Never mind, that semantics can't use that syntax, otherwise you have hidden 
heap allocations... The syntax has to change (because the D-VLA semantics seems 
OK to me).

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Ulrik Mikaelsson
2011/4/22 Timon Gehr :
> That is only a problem if the reference count of that resource changes at a 
> very
> high frequency. The described problem also implies that the threads would not 
> need
> any form of synchronization for the data (otherwise the reference count 
> certainly
> would not be a bottleneck.)

> Michael Stover wrote:
>> I'd like to say "were proved" rather than "are believed", but I don't 
>> actually
> know where to go for such evidence.
>>  However, I do believe many scripting languages, such as python, eventually
> ditched the reference counting technique for
>> generational, and Java has very fast GC, so I am inclined to believe those
> real-life solutions than Linus.
>
> Well, the GC may be fast when compared to other GCs, but it has to be 
> designed to
> run on general data whose reference structure can be arbitrary. Often, the
> objects/references have a somewhat specialized structure that a smart 
> programmer
> can exploit, especially if the references point to private data. But that only
> matters if performance is of prime interest, and the gains may be not very 
> big.

All in all, I think the best approach is a pragmatic one, where
different types of resources can be handled according to different
schemes.

I.E. default to GC-manage everything. After profiling, determining
what resources are mostly used, and where, optimize allocation for
those resources, preferably to scoped allocation, or if not possible,
reference-counted.

Premature optimization is a root of much evil, for instance, the
malloc-paranoid might very well resort to abuse of struct:s, leading
either to lots of manual pointers, or excessive memory copying.

Incidentally, this was the main thing that attracted me to D. Be
lazy/productive where performance doesn't matter much, and focus
optimization on where it does.


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Timon Gehr
Ulrik Mikaelsson wrote:
> All in all, I think the best approach is a pragmatic one, where
> different types of resources can be handled according to different
> schemes.
>
> I.E. default to GC-manage everything. After profiling, determining
> what resources are mostly used, and where, optimize allocation for
> those resources, preferably to scoped allocation, or if not possible,
> reference-counted.
>
> Premature optimization is a root of much evil, for instance, the
> malloc-paranoid might very well resort to abuse of struct:s, leading
> either to lots of manual pointers, or excessive memory copying.
>
> Incidentally, this was the main thing that attracted me to D. Be
> lazy/productive where performance doesn't matter much, and focus
> optimization on where it does.

That is very true. GC is almost always fast enough or even faster. And it is
clearly most convenient. And yes, identify bottlenecks first, optimize later. 
But
I also think programs that have some concern about efficient memory allocation
(with GC or without GC) tend to be better designed in general. This actually
increases productivity. Plus, it reduces the need for complicated optimizations
later on. This in order increases maintainability.

-Timon


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Jérôme M. Berger
Michael Stover wrote:
> This sort of reference count with cyclic dependency detector is how a
> lot of scripting languages do it, or did it in the past.  The problem
> was that lazy generational GCs are believed to have better throughput in
> general.
>
> I'd like to say "were proved" rather than "are believed", but I don't
> actually know where to go for such evidence.  However, I do believe many
> scripting languages, such as python, eventually ditched the reference
> counting technique for generational, and Java has very fast GC, so I am
> inclined to believe those real-life solutions than Linus.
>
This is a single benchmark on a single obscure language, so
take it with a grain of salt, but look at the "Nimrod" and "Nimrod
Boehm" lines here: https://bitbucket.org/jmb/shootout/wiki/Home

By default, Nimrod uses a reference counter but there is a
compiler switch to use the Boehm GC instead. Of particular interest
is the benchmark "Binary trees", since it is specially designed to
exercise memory management. On that benchmark, the Boehm version is
about twice as fast as the refcounting version, while on other
benchmarks, performance is comparable.

Jerome
-- 
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr





signature.asc
Description: OpenPGP digital signature


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article
> Iain Buclaw:
> > Variable length arrays are just sugary syntax for a call to alloca.
> I have an enhancement request in Bugzilla on VLA, with longer discussions. 
> Just
two comments:
> - It seems alloca() can be implemented with two different semantics: to
deallocate at the end of the function or to deallocate at the end of the scope.
Usually alloca() deallocates at the end of the function, but that semantic
confusion is dangerous. VLA deallocate at the end of the scope, just like any
other static array.

Yep, my understanding of the way C99 goes about it (I might be wrong here, so
don't hurt me :), the stack is saved for the lifetime of the alloca'd array, and
restored once it goes out of scope.

ie:

int sz = 42;
{
  void * saved_stack = __stack_save (); // start of scope
  int array[0 : D.range];
  uint D.range;  // automatic variable
  void * D.ptr;  // automatic variable

  D.range = sz + -1;
  D.ptr = __builtin_alloca (sz * 4);
  array = (int[0 : D.range] *) D.ptr;

  // later...

  __stack_restore (saved_stack);// end of scope
}


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Andrei Alexandrescu

On 4/22/11 4:04 PM, Timon Gehr wrote:
[snip]

This is also the reason I think it is a bad idea to deprecate D's
'delete'.


The functionality is not going away.

Andrei


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Andrei Alexandrescu

On 4/22/11 5:21 PM, Iain Buclaw wrote:

== Quote from bearophile (bearophileh...@lycos.com)'s article

Timon Gehr:

But, as pointed out by Linus, the prime performance problem is _not_ the GC, but
the mindset that comes with it. Most programmers that "grew up" in a managed
environment tend to use very many "new" keywords all over their code, instead of
allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
this.)

In C99 (and Ada) you avoid the allocation of some dynamic arrays with new thanks

to variable length arrays.

Variable length arrays are just sugary syntax for a call to alloca.


Well in fairness they can be a fair amount more elaborate. The trouble 
with alloca is that it's impossible to compose with. So in general you 
need to write something like:


bool onStack = smallEnough(length * sizeof(T));
auto a = (cast(T*) (onStack ? alloca : malloc)(length * sizeof(T)));
scope(exit) if (!onStack) free(a.ptr);
initialize(enforce(a));
scope(exit) clear(a);

This block is difficult to factor out because of alloca.


Andrei


Re: Linus with some good observations on garbage collection

2011-04-23 Thread dsimcha

On 4/23/2011 9:48 AM, Andrei Alexandrescu wrote:

On 4/22/11 5:21 PM, Iain Buclaw wrote:

== Quote from bearophile (bearophileh...@lycos.com)'s article

Timon Gehr:

But, as pointed out by Linus, the prime performance problem is _not_
the GC, but
the mindset that comes with it. Most programmers that "grew up" in a
managed
environment tend to use very many "new" keywords all over their
code, instead of
allocating large chunks of memory at once. (Java/C#/etc encourage
you to do this.)

In C99 (and Ada) you avoid the allocation of some dynamic arrays with
new thanks

to variable length arrays.

Variable length arrays are just sugary syntax for a call to alloca.


Well in fairness they can be a fair amount more elaborate. The trouble
with alloca is that it's impossible to compose with. So in general you
need to write something like:

bool onStack = smallEnough(length * sizeof(T));
auto a = (cast(T*) (onStack ? alloca : malloc)(length * sizeof(T)));
scope(exit) if (!onStack) free(a.ptr);
initialize(enforce(a));
scope(exit) clear(a);

This block is difficult to factor out because of alloca.


Andrei


Right.  This is exactly the kind of thing TempAlloc was meant to solve. 
 (Actually, to give credit, the initial rough idea for TempAlloc was 
proposed by Andrei, though I fleshed out the details and implemented 
it.)  With TempAlloc, you have a segmented stack (i.e. one that 
allocates a new segment instead of crashing the program if it's full) 
that is independent of the call stack (so you can return 
TempAlloc-allocated memory from functions).


BTW, since when does the ternary operator work with functions, as 
opposed to variables?


Re: Linus with some good observations on garbage collection

2011-04-23 Thread bearophile
Andrei:

> Well in fairness they can be a fair amount more elaborate. The trouble 
> with alloca is that it's impossible to compose with. So in general you 
> need to write something like:
> 
> bool onStack = smallEnough(length * sizeof(T));
> auto a = (cast(T*) (onStack ? alloca : malloc)(length * sizeof(T)));
> scope(exit) if (!onStack) free(a.ptr);
> initialize(enforce(a));
> scope(exit) clear(a);
> 
> This block is difficult to factor out because of alloca.

See this too:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=135208

Keep in mind that alloca() returns null more often than malloc, and when 
alloca() return null, it's still meaningful to try to use malloc(). So the 
semantics you have written is not good enough yet (as mine. A merge of the two 
seems better).

My idea of D-style VLAs is something built-in that encapsulates that semantics 
(that's more complex than C99 VLA semantics).

I am suggesting a built-in because there is some more semantics to keep into 
account for a good implementation:
- Probably you want the arrays created on the stack to have an immutable length 
(otherwise you have to switch to a normal fully GC-heap dynamic array, but I 
don't like this).
- You have to return such arrays from functions, and to give them to function 
arguments. As fixed-sized arrays they probably need to be managed by value 
(unless you use "ref"), so they are different from dynamic arrays, and closer 
to fixed-sized ones.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-23 Thread bearophile
dsimcha:

> Right.  This is exactly the kind of thing TempAlloc was meant to solve. 

I think TempAlloc is meant for larger allocations. D-VLAs are meant to allocate 
little arrays (like 1 KB or less) on the normal stack very quickly.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-23 Thread bearophile
Andrei:

> bool onStack = smallEnough(length * sizeof(T));
> auto a = (cast(T*) (onStack ? alloca : malloc)(length * sizeof(T)));
> scope(exit) if (!onStack) free(a.ptr);
> initialize(enforce(a));
> scope(exit) clear(a);
> 
> This block is difficult to factor out because of alloca.

As I have explained in my old enhancement request about VLAs and in a recent 
post, this code is dangerous, because alloca() in some implementations 
deallocates at the end of the function, while in other cases it deallocates at 
the end of the scope. A well implemented D-VLA needs be sure to free the stack 
space at the end of the scope.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-23 Thread dsimcha

On 4/23/2011 10:24 AM, bearophile wrote:

dsimcha:


Right.  This is exactly the kind of thing TempAlloc was meant to solve.


I think TempAlloc is meant for larger allocations. D-VLAs are meant to allocate 
little arrays (like 1 KB or less) on the normal stack very quickly.

Bye,
bearophile


Why would TempAlloc not work for this?  It's slightly slower than VLAs, 
but the difference is negligible since it's still almost never a 
threading bottleneck and you still have to do something with the array 
you allocated.


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Andrei Alexandrescu

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?


They are converted to pointers to functions. Not something I'd recommend 
because it makes the call slower.


Andrei


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Timon Gehr
> On 4/22/11 4:04 PM, Timon Gehr wrote:
> [snip]
>> This is also the reason I think it is a bad idea to deprecate D's
>> 'delete'.
>
> The functionality is not going away.
>
> Andrei

Yes I realize that. It is a matter of syntax and aesthetics. Handling allocation
and deallocation non-uniformly renders the language ugly (or are you considering
to remove the 'new' keyword too?). And it discourages people to get informed 
about
custom allocators etc. I mean, how will custom allocators look like? The cost of
having the delete keyword in the language is low. Before removing that, why not
consider the 'body' keyword instead?

Timon


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Andrei Alexandrescu

On 4/23/11 11:33 AM, Timon Gehr wrote:

On 4/22/11 4:04 PM, Timon Gehr wrote:
[snip]

This is also the reason I think it is a bad idea to deprecate D's
'delete'.


The functionality is not going away.

Andrei


Yes I realize that. It is a matter of syntax and aesthetics. Handling allocation
and deallocation non-uniformly renders the language ugly (or are you considering
to remove the 'new' keyword too?).


Allocation and deallocation are not symmetric and as such handling them 
in a uniform way would be a mistake that perpetuates a poor 
understanding of the underlying realities of memory allocation and 
object creation. I suggest you reconsider.



And it discourages people to get informed about
custom allocators etc.


I don't see the relationship here.


I mean, how will custom allocators look like? The cost of
having the delete keyword in the language is low. Before removing that, why not
consider the 'body' keyword instead?


The cost of keeping "delete" in the language is making a rarely-used, 
dangerous facility with loosely-defined semantics straight inside the 
language. It reflects poor language design for many reasons.



Andrei



Re: Linus with some good observations on garbage collection

2011-04-23 Thread bearophile
dsimcha:

> It's slightly slower than VLAs, 
> but the difference is negligible since it's still almost never a 
> threading bottleneck and you still have to do something with the array 
> you allocated.

I have to compare the performance of both for my use cases (fast allocations of 
very small arrays, even a just a small number of chars).
I think an advantage of stack allocations is that most times the top of the 
current stack is already in cache L1, so it's warm memory.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Timon Gehr
Andrei Alexandrescu wrote:
> Allocation and deallocation are not symmetric and as such handling them
> in a uniform way would be a mistake that perpetuates a poor
> understanding of the underlying realities of memory allocation and
> object creation. I suggest you reconsider.

'char' is completely orthogonal to memory allocation and it still gets handled
uniformly to it syntactically. So I do not really get the point here.

>
>> And it discourages people to get informed about
>> custom allocators etc.
>
>I don't see the relationship here.

I cannot imagine that the syntax for them will be particularly nice after the
removal of 'delete'. Who likes features with unnecessarily clumsy syntax?

>The cost of keeping "delete" in the language is making a rarely-used,
>dangerous facility with loosely-defined semantics straight inside the
>language. It reflects poor language design for many reasons.
>
>Andrei

I think we are talking about different things. I do _not_ want to use the 
keyword
to somehow try to deallocate GC allocated memory. (who would actually want to do
this anyways?) I want it for custom allocators as in
http://www.digitalmars.com/d/2.0/memory.html. In that case, the semantics would 
be
completely specified by me. Removing the keyword means ruining that use case. 
You
can also find a nice, clean and strict semantic specification for delete when
called on GC memory that works on all platforms and with all GC implementations:
"do nothing, GC memory is managed automatically." As it is done now, I agree it 
is
not particularly nice.

Timon


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Walter Bright

On 4/22/2011 2:04 PM, Timon Gehr wrote:

The reason Java is garbage collected is _not_ performance, but primarily
reliability.


I believe a GC is required if one wishes to prove memory safety, which is 
definitely a design goal of Java.


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Piotr Szturmaj

Timon Gehr wrote:

Andrei Alexandrescu wrote:

Allocation and deallocation are not symmetric and as such handling them
in a uniform way would be a mistake that perpetuates a poor
understanding of the underlying realities of memory allocation and
object creation. I suggest you reconsider.


'char' is completely orthogonal to memory allocation and it still gets handled
uniformly to it syntactically. So I do not really get the point here.




And it discourages people to get informed about
custom allocators etc.


I don't see the relationship here.


I cannot imagine that the syntax for them will be particularly nice after the
removal of 'delete'. Who likes features with unnecessarily clumsy syntax?


The cost of keeping "delete" in the language is making a rarely-used,
dangerous facility with loosely-defined semantics straight inside the
language. It reflects poor language design for many reasons.

Andrei


I think we are talking about different things. I do _not_ want to use the 
keyword
to somehow try to deallocate GC allocated memory. (who would actually want to do
this anyways?) I want it for custom allocators as in
http://www.digitalmars.com/d/2.0/memory.html. In that case, the semantics would 
be
completely specified by me. Removing the keyword means ruining that use case. 
You
can also find a nice, clean and strict semantic specification for delete when
called on GC memory that works on all platforms and with all GC implementations:
"do nothing, GC memory is managed automatically." As it is done now, I agree it 
is
not particularly nice.

Timon


Why not allow "delete" only in classes with custom allocators? AFAIK 
they could be determined at compile time. Please let me know if I'm wrong.


Re: Linus with some good observations on garbage collection

2011-04-23 Thread Andrei Alexandrescu

On 4/23/11 1:05 PM, Timon Gehr wrote:

Andrei Alexandrescu wrote:

Allocation and deallocation are not symmetric and as such handling them
in a uniform way would be a mistake that perpetuates a poor
understanding of the underlying realities of memory allocation and
object creation. I suggest you reconsider.


'char' is completely orthogonal to memory allocation and it still gets handled
uniformly to it syntactically. So I do not really get the point here.


The point is that some naively believe "new" and "delete" are some 
simple do and undo actions and as such they somehow deserve the same 
level of support. In reality they have dramatically different issues and 
are profoundly asymmetric.



And it discourages people to get informed about
custom allocators etc.


I don't see the relationship here.


I cannot imagine that the syntax for them will be particularly nice after the
removal of 'delete'. Who likes features with unnecessarily clumsy syntax?


The cost of keeping "delete" in the language is making a rarely-used,
dangerous facility with loosely-defined semantics straight inside the
language. It reflects poor language design for many reasons.

Andrei


I think we are talking about different things. I do _not_ want to use the 
keyword
to somehow try to deallocate GC allocated memory. (who would actually want to do
this anyways?) I want it for custom allocators as in
http://www.digitalmars.com/d/2.0/memory.html. In that case, the semantics would 
be
completely specified by me. Removing the keyword means ruining that use case. 
You
can also find a nice, clean and strict semantic specification for delete when
called on GC memory that works on all platforms and with all GC implementations:
"do nothing, GC memory is managed automatically." As it is done now, I agree it 
is
not particularly nice.

Timon


The class-specific allocators are an exceedingly bad design carried over 
from C++. They will be deprecated from D as well.


Even C++, where the features you are talking about have existed for many 
years, is shunning their use. Good applications make very sparingly use 
of "delete" and virtually none uses class-specific allocators. 
Experience with C++ is the best proof that the delete keyword and 
class-specific allocators have no place in the D programming language.



Andrei


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Timon Gehr
Andrei Alexandrescu wrote:
> The point is that some naively believe "new" and "delete" are some
> simple do and undo actions and as such they somehow deserve the same
> level of support. In reality they have dramatically different issues and
> are profoundly asymmetric.


> The class-specific allocators are an exceedingly bad design carried over
> from C++. They will be deprecated from D as well.

Does that imply that the feature is going away after all?

> Even C++, where the features you are talking about have existed for many
> years, is shunning their use. Good applications make very sparingly use
> of "delete" and virtually none uses class-specific allocators.
> Experience with C++ is the best proof that the delete keyword and
> class-specific allocators have no place in the D programming language.
>
>
> Andrei

Virtually no good application uses them? The _DMD compiler implementation_ uses
them. Classes 'Token' and 'Scope' have class-specific optimized allocators. I
don't think Walter would have done that if it did not give quite good benefits. 
D
is a systems programming language after all. It is not Java or some scripting
language. I don't think D should/can get rid of class-specific allocators if 
there
is no good alternative for sane uses. Even if they are not used often and should
not be used often.

C++ is not garbage collected by default. Implementing a GC root class for 
example
is another legitimate use case for class-specific allocators. In D, things are
somewhat reverse, class-specific allocators would be used for manual memory
management. D should support that.

So if class-specific allocators go away, what is the alternative? Sure, it is
always possible to allocate using custom functions (for D structs, at least.) 
But
then, if allocations of one struct/class turn out to be a bottleneck somewhere
late in the development, the whole code will have to be changed instead of just
the addition of some simple custom class-specific allocator. (This is even worse
when developing a library). Also, just testing the impact of a custom allocator
will be a pain for the same reason. Why is that better?


Timon


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrei Alexandrescu

On 04/24/2011 01:21 PM, Timon Gehr wrote:

Andrei Alexandrescu wrote:

The point is that some naively believe "new" and "delete" are some
simple do and undo actions and as such they somehow deserve the same
level of support. In reality they have dramatically different issues and
are profoundly asymmetric.




The class-specific allocators are an exceedingly bad design carried over
from C++. They will be deprecated from D as well.


Does that imply that the feature is going away after all?


Even C++, where the features you are talking about have existed for many
years, is shunning their use. Good applications make very sparingly use
of "delete" and virtually none uses class-specific allocators.
Experience with C++ is the best proof that the delete keyword and
class-specific allocators have no place in the D programming language.


Andrei


Virtually no good application uses them? The _DMD compiler implementation_ uses
them. Classes 'Token' and 'Scope' have class-specific optimized allocators. I
don't think Walter would have done that if it did not give quite good benefits.


A class-specific freelist-based allocator in fact stands in the way of a 
good general-purpose allocator. At Facebook we did extensive 
measurements on a variety of applications and workloads and the best 
approach was to just disable all STL class-specific allocators (by 
defining GLIBCXX_FORCE_NEW) and let jemalloc take care of all 
allocations, to great effect.


Today's good general-purpose allocators are as good or usually 
considerably better than custom allocators. The one category of custom 
allocators that is still significantly better are regions, but they have 
quite specific and restricted semantics.



D
is a systems programming language after all. It is not Java or some scripting
language. I don't think D should/can get rid of class-specific allocators if 
there
is no good alternative for sane uses. Even if they are not used often and should
not be used often.

C++ is not garbage collected by default. Implementing a GC root class for 
example
is another legitimate use case for class-specific allocators. In D, things are
somewhat reverse, class-specific allocators would be used for manual memory
management. D should support that.

So if class-specific allocators go away, what is the alternative? Sure, it is
always possible to allocate using custom functions (for D structs, at least.) 
But
then, if allocations of one struct/class turn out to be a bottleneck somewhere
late in the development, the whole code will have to be changed instead of just
the addition of some simple custom class-specific allocator. (This is even worse
when developing a library). Also, just testing the impact of a custom allocator
will be a pain for the same reason. Why is that better?


Timon


It is much better in very many ways, in particular in D. For one simple 
argument, consider:


auto obj = new Widget;

Simple modularity considerations would require this operator to have 
semantics independent of Widget, such that modular and generic code can 
use it properly. Custom class allocators ruin all that, because they 
would require the use of delete paired with new, whereas the general 
operator requires no action.


Memory allocation and object construction are difficult topics, 
particularly in high-performance settings. A seasoned engineer 
understands that and does not naively believe that a smattering of 
well-intended but very badly executed language support will make that 
all easier. (Contrast that with e.g. reference counting semantics using 
RAII, which relies on a much better part of the language and does offer 
reasonable guarantees.)


I am sorry, you simply have no case - each and every argument you put 
forth has no strength or is just wrong. We could spend time on debating 
each, but I suspect that would do little toward convincing you of what 
is after all a simple fact, but one with many subtle facets. It might 
not a good use of our time to further engage in a diatribe on this. The 
delete keyword will go, as will class-specific new and delete.



Andrei


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrej Mitrovic
Since this is definite we should remove them from the docs as well.
Want me to file it to bugzilla?


Re: Linus with some good observations on garbage collection

2011-04-24 Thread bearophile
Andrei:

> It might not a good use of our time to further engage in a diatribe on this.

Just a note: you have not wasted your time discussing this, because I am not 
expert on such matters, so I learn something from informed discussions :-)

If a book that explains C language is just 200 pages long, a book that explains 
the design purpose, design constraints and design compromises of every feature 
of C language requires a much longer book (800 pages?), and it's probably also 
much more interesting :-)

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Francisco Almeida
== Quote from bearophile (bearophileh...@lycos.com)'s article
> Andrei:
> > It might not a good use of our time to further engage in a diatribe on this.
> Just a note: you have not wasted your time discussing this, because I am not
expert on such matters, so I learn something from informed discussions :-)
> If a book that explains C language is just 200 pages long, a book that 
> explains
the design purpose, design constraints and design compromises of every feature 
of
C language requires a much longer book (800 pages?), and it's probably also much
more interesting :-)
> Bye,
> bearophile

As one of those who were the most reluctant about abandoning delete, I have come
to realize that it is better to make D classes strictly garbage collected, and
drop delete.

It should be noted that D supports RAII, as long as one uses struct (thus 
avoiding
dynamic polymorphism, which shouldn't make a big difference once you think about
it), and this gives D an important semantic separation lacking in C++ (or even
those languages on the other side of the fence such as Java, for that matter).

Making a habit of declaring struct instead of class when I intend to create
"objects" on the stack instead of the heap, while resorting to garbage collected
classes when it makes sense to do so, has made my life simpler.

I do believe, however, that there is a recurrent communication error going on 
here.

I would kindly ask the language designers to please update the documentation and
write an article explaining the advantages of the approach (without making the
mistake of turning it into a blind praise of garbage collection, that is).
Newcomers need to know how to obtain garbage collected or RAII data structures
from D, whenever they want to.


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrei Alexandrescu

On 04/24/2011 03:35 PM, Andrej Mitrovic wrote:

Since this is definite we should remove them from the docs as well.
Want me to file it to bugzilla?


Please do. BTW I just fixed the core-related documentation bug on 
d-programming-language.org.


Andrei


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrej Mitrovic
On 4/24/11, Andrei Alexandrescu  wrote:
> On 04/24/2011 03:35 PM, Andrej Mitrovic wrote:
>> Since this is definite we should remove them from the docs as well.
>> Want me to file it to bugzilla?
>
> Please do. BTW I just fixed the core-related documentation bug on
> d-programming-language.org.
>
> Andrei
>

Fantastic, thank you.


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Piotr Szturmaj

Piotr Szturmaj wrote:

Why not allow "delete" only in classes with custom allocators? AFAIK
they could be determined at compile time. Please let me know if I'm wrong.


Nevermind. Andrei informed that class allocators are being deprecated, 
so my question is no longer relevant.


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Timon Gehr
Andrei Alexandrescu wrote:
> I am sorry, you simply have no case - each and every argument you put
> forth has no strength or is just wrong. We could spend time on debating
> each, but I suspect that would do little toward convincing you of what
> is after all a simple fact, but one with many subtle facets. It might
> not a good use of our time to further engage in a diatribe on this. The
> delete keyword will go, as will class-specific new and delete.
>
>
> Andrei

Ok, lets stop. Custom allocators and delete should probably be removed from D:

Some of my facts:
* freelist is faster if most allocation and deallocation concentrates on only 
one
class. (see attachment) I think this is also the reason it increases DMDs speed.
Token and Scope are quite obvious bottlenecks..
* such bottlenecks will most of the time only be detected when development is
almost finished. (Almost never for library code, so I don't get why STL uses
custom allocators at all.)
* premature optimization is evil (see STL custom allocators)
* late optimization: it is easy if there are custom allocators. In C++ it is 
even
trivial. Nice design. Still no clue why STL is abusing it.
* that form of custom allocators does not play nicely with Ds Garbage collector
though, because it changes the semantics of 'new', ergo builtin custom 
allocators
should probably indeed be removed from D.
* this makes this form of optimization painful in D without custom allocators.
This is what I was instinctively scared about. Fact is: it is also painful with
them, because you don't get the 'delete's for free as in C++. I was somewhat
inconsiderate on this.
* it is possible that hinting the GC is only about twice as bad (again, see
attachment).


And some notes:
* the fact that you think all my points were worthless or wrong implies that you
either did not read all of them or have not considered them yourself when making
your decision. I think that is scary.
* There are many ways in which I am always right. The most dominant being that 
any
sufficiently smart guy with a PhD in rocket science would totally agree with all
my statements. No pun/offense intended.
* this discussion would have been a waste of your time, had it not been public.


Thanks for discussing. It was helpful.


Timon

PS: The current GC dies without hints on my benchmark, so it is indeed a nice 
idea
to keep hints:
version=GC:   1m45s
version=GC_hints: ~6.8s
version=freelist: ~3.7s
version=c_heap:   ~4.5s
begin 644 benchmark.d
M+RH@0F5N8VAM87)K($=#('9S($=#("8@:&EN=',@=G,@9G)E96QI#(N-"!':%H@;VX@56)U;G1U(#$P+C`T("AL=6-I9"D@06UD-C0*("H@=F5R
MF5?="!S*7L*"0D):68H9G)E96QIPH)"0D)=F]I9"`JPH)"0D)9F]O(&]L9&9L/69R965L:7-T.PH)"0D)9G)E96QI
M'0];VQD9FP["@D)"7T*
M"0E]"@E]"@EV97)S:6]N*&-?:&5A<"E["@D);F5W*'-I>F5?="!S*7MR971U
M'!E8W1E9"!S:')I;FMI;F<*"69O

Re: Linus with some good observations on garbage collection

2011-04-24 Thread Iain Buclaw
== Quote from Timon Gehr (timon.g...@gmx.ch)'s article
> Andrei Alexandrescu wrote:
> > I am sorry, you simply have no case - each and every argument you put
> > forth has no strength or is just wrong. We could spend time on debating
> > each, but I suspect that would do little toward convincing you of what
> > is after all a simple fact, but one with many subtle facets. It might
> > not a good use of our time to further engage in a diatribe on this. The
> > delete keyword will go, as will class-specific new and delete.
> >
> >
> > Andrei

Is it correct that the current D GC implementation puts delete'd allocations 
into
a freelist (or bucket) to be reused when 'new' is invoked again?

> Ok, lets stop. Custom allocators and delete should probably be removed from D:
> Some of my facts:
> * freelist is faster if most allocation and deallocation concentrates on only 
> one
> class. (see attachment) I think this is also the reason it increases DMDs 
> speed.
> Token and Scope are quite obvious bottlenecks..
> * such bottlenecks will most of the time only be detected when development is
> almost finished. (Almost never for library code, so I don't get why STL uses
> custom allocators at all.)
> * premature optimization is evil (see STL custom allocators)
> * late optimization: it is easy if there are custom allocators. In C++ it is 
> even
> trivial. Nice design. Still no clue why STL is abusing it.
> * that form of custom allocators does not play nicely with Ds Garbage 
> collector
> though, because it changes the semantics of 'new', ergo builtin custom 
> allocators
> should probably indeed be removed from D.
> * this makes this form of optimization painful in D without custom allocators.
> This is what I was instinctively scared about. Fact is: it is also painful 
> with
> them, because you don't get the 'delete's for free as in C++. I was somewhat
> inconsiderate on this.
> * it is possible that hinting the GC is only about twice as bad (again, see
> attachment).
> And some notes:
> * the fact that you think all my points were worthless or wrong implies that 
> you
> either did not read all of them or have not considered them yourself when 
> making
> your decision. I think that is scary.

I find it nuts that someone considers blaming/shunning language features just to
mask bad runtime semantics they refuse to fix - but maybe it's just my ignorance
on the subject that professes that view.


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrei Alexandrescu

On 04/24/2011 08:08 PM, Iain Buclaw wrote:

== Quote from Timon Gehr (timon.g...@gmx.ch)'s article

Andrei Alexandrescu wrote:

I am sorry, you simply have no case - each and every argument you put
forth has no strength or is just wrong. We could spend time on debating
each, but I suspect that would do little toward convincing you of what
is after all a simple fact, but one with many subtle facets. It might
not a good use of our time to further engage in a diatribe on this. The
delete keyword will go, as will class-specific new and delete.


Andrei


Is it correct that the current D GC implementation puts delete'd allocations 
into
a freelist (or bucket) to be reused when 'new' is invoked again?


I don't know. If it does, an additional level of freelists only hurts.


Ok, lets stop. Custom allocators and delete should probably be removed from D:
Some of my facts:
* freelist is faster if most allocation and deallocation concentrates on only 
one
class. (see attachment) I think this is also the reason it increases DMDs speed.
Token and Scope are quite obvious bottlenecks..
* such bottlenecks will most of the time only be detected when development is
almost finished. (Almost never for library code, so I don't get why STL uses
custom allocators at all.)
* premature optimization is evil (see STL custom allocators)
* late optimization: it is easy if there are custom allocators. In C++ it is 
even
trivial. Nice design. Still no clue why STL is abusing it.
* that form of custom allocators does not play nicely with Ds Garbage collector
though, because it changes the semantics of 'new', ergo builtin custom 
allocators
should probably indeed be removed from D.
* this makes this form of optimization painful in D without custom allocators.
This is what I was instinctively scared about. Fact is: it is also painful with
them, because you don't get the 'delete's for free as in C++. I was somewhat
inconsiderate on this.
* it is possible that hinting the GC is only about twice as bad (again, see
attachment).
And some notes:
* the fact that you think all my points were worthless or wrong implies that you
either did not read all of them or have not considered them yourself when making
your decision. I think that is scary.


I find it nuts that someone considers blaming/shunning language features just to
mask bad runtime semantics they refuse to fix - but maybe it's just my ignorance
on the subject that professes that view.


I'd appreciate if you provided more detail on that. Far as I can tell 
the problem is with the feature under discussion, not some alleged 
unwillingness to do work.



Andrei


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Robert Jacques
On Sun, 24 Apr 2011 21:13:43 -0400, Andrei Alexandrescu  
 wrote:



On 04/24/2011 08:08 PM, Iain Buclaw wrote:

== Quote from Timon Gehr (timon.g...@gmx.ch)'s article

Andrei Alexandrescu wrote:

I am sorry, you simply have no case - each and every argument you put
forth has no strength or is just wrong. We could spend time on  
debating

each, but I suspect that would do little toward convincing you of what
is after all a simple fact, but one with many subtle facets. It might
not a good use of our time to further engage in a diatribe on this.  
The

delete keyword will go, as will class-specific new and delete.


Andrei


Is it correct that the current D GC implementation puts delete'd  
allocations into

a freelist (or bucket) to be reused when 'new' is invoked again?


I don't know. If it does, an additional level of freelists only hurts.


To the best of my knowledge, for 'small' allocations, the current GC uses  
free-lists of power-of-2 sized blocks. (I think small is defined as  
anything under 4k)


Re: Linus with some good observations on garbage collection

2011-04-24 Thread Andrei Alexandrescu

On 04/24/2011 07:10 PM, Timon Gehr wrote:

Andrei Alexandrescu wrote:

I am sorry, you simply have no case - each and every argument you put
forth has no strength or is just wrong. We could spend time on debating
each, but I suspect that would do little toward convincing you of what
is after all a simple fact, but one with many subtle facets. It might
not a good use of our time to further engage in a diatribe on this. The
delete keyword will go, as will class-specific new and delete.


Andrei


Ok, lets stop. Custom allocators and delete should probably be removed from D:


Thank you for agreeing.


Some of my facts:
* freelist is faster if most allocation and deallocation concentrates on only 
one
class. (see attachment) I think this is also the reason it increases DMDs speed.
Token and Scope are quite obvious bottlenecks..


I agree. There are two other things that I don't agree with:

1. This particular point does nothing to help the original argument, 
which was to keep the delete keyword (and in particular class 
allocators) in the D programming language.


2. Freelists are not all smiles - have their big problems too. When 
deciding to go with freelist the implementer must also decide on a 
strategy for releasing that memory back to the backend allocator on some 
schedule. Choosing a "good" schedule is already a full-blown problem 
that is best left to the the specialists implementing the backend 
allocator. My colleagues discovered with horror that unless the 
environment variable GLIBCXX_FORCE_NEW was defined, the STL node-based 
containers would _never_ give back memory once allocated.


Memory allocation is difficult, and for any allocator some workloads 
exist that makes it worse than all others. Nevertheless, recent 
developments in general-purpose memory allocators (e.g. the 
Low-Fragmentation Heap on Windows and a slew of allocators on Unix 
including ptmalloc, tcmalloc, jemalloc, and others) have shown that it's 
best to just use a good general-purpose allocator instead of fiddling 
with custom allocation.



* such bottlenecks will most of the time only be detected when development is
almost finished. (Almost never for library code, so I don't get why STL uses
custom allocators at all.)
* premature optimization is evil (see STL custom allocators)
* late optimization: it is easy if there are custom allocators. In C++ it is 
even
trivial. Nice design. Still no clue why STL is abusing it.
* that form of custom allocators does not play nicely with Ds Garbage collector
though, because it changes the semantics of 'new', ergo builtin custom 
allocators
should probably indeed be removed from D.
* this makes this form of optimization painful in D without custom allocators.
This is what I was instinctively scared about. Fact is: it is also painful with
them, because you don't get the 'delete's for free as in C++. I was somewhat
inconsiderate on this.
* it is possible that hinting the GC is only about twice as bad (again, see
attachment).


OK.

Allow me to add a speculation here. I believe that one mistake that both 
class-level new/delete and STL allocators have done was to associate 
allocators with types. That just seems to force the proverbial square 
peg through the round hole. General-purpose allocators just focus on 
size classes and do awfully well. In addition, they do that in an 
integrated, non-modular way (as work on HeapLayers has shown, 
modularizing allocators is difficult and not very intuitive).



And some notes:
* the fact that you think all my points were worthless or wrong implies that you
either did not read all of them or have not considered them yourself when making
your decision. I think that is scary.


Well I think I owe you an apology. It is very difficult to use limited 
time resources to convince someone of rather subtle matters, 
particularly if the discussion starts not with a question (e.g. "In 
light of what I said, I wonder whether removing delete is the right way 
to go."), but with an assertion. At that point I need to make a sort of 
a cold estimate - do I have a reasonable chance of convincing someone of 
something within a reasonable time frame? It's difficult to make and 
convey that decision without seeming impolite, although please believe 
me being curt pains me as much as anyone.


To make matters more difficult, I'm not an expert in memory allocation 
and garbage collection. Probably I'm just at the point I can recognise 
good work when I see it. Yet, after a book chapter on memory allocation, 
a couple of articles written on memory allocation, a course I'm teaching 
on memory allocation, and a good amount of work on STL's interaction 
with jemalloc while at Facebook - I must admit I've become a bit jaded 
and as such less sanguine about arguing my points.



* There are many ways in which I am always right. The most dominant being that 
any
sufficiently smart guy with a PhD in rocket science would totally agree with all
my statements. No pun/offense int

Re: Linus with some good observations on garbage collection

2011-04-25 Thread Iain Buclaw
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
> On 04/24/2011 08:08 PM, Iain Buclaw wrote:
> > == Quote from Timon Gehr (timon.g...@gmx.ch)'s article
> >> Andrei Alexandrescu wrote:
> >>> I am sorry, you simply have no case - each and every argument you put
> >>> forth has no strength or is just wrong. We could spend time on debating
> >>> each, but I suspect that would do little toward convincing you of what
> >>> is after all a simple fact, but one with many subtle facets. It might
> >>> not a good use of our time to further engage in a diatribe on this. The
> >>> delete keyword will go, as will class-specific new and delete.
> >>>
> >>>
> >>> Andrei
> >
> > Is it correct that the current D GC implementation puts delete'd 
> > allocations into
> > a freelist (or bucket) to be reused when 'new' is invoked again?
> I don't know. If it does, an additional level of freelists only hurts.
> >> Ok, lets stop. Custom allocators and delete should probably be removed 
> >> from D:
> >> Some of my facts:
> >> * freelist is faster if most allocation and deallocation concentrates on 
> >> only one
> >> class. (see attachment) I think this is also the reason it increases DMDs 
> >> speed.
> >> Token and Scope are quite obvious bottlenecks..
> >> * such bottlenecks will most of the time only be detected when development 
> >> is
> >> almost finished. (Almost never for library code, so I don't get why STL 
> >> uses
> >> custom allocators at all.)
> >> * premature optimization is evil (see STL custom allocators)
> >> * late optimization: it is easy if there are custom allocators. In C++ it 
> >> is even
> >> trivial. Nice design. Still no clue why STL is abusing it.
> >> * that form of custom allocators does not play nicely with Ds Garbage 
> >> collector
> >> though, because it changes the semantics of 'new', ergo builtin custom 
> >> allocators
> >> should probably indeed be removed from D.
> >> * this makes this form of optimization painful in D without custom 
> >> allocators.
> >> This is what I was instinctively scared about. Fact is: it is also painful 
> >> with
> >> them, because you don't get the 'delete's for free as in C++. I was 
> >> somewhat
> >> inconsiderate on this.
> >> * it is possible that hinting the GC is only about twice as bad (again, see
> >> attachment).
> >> And some notes:
> >> * the fact that you think all my points were worthless or wrong implies 
> >> that you
> >> either did not read all of them or have not considered them yourself when 
> >> making
> >> your decision. I think that is scary.
> >
> > I find it nuts that someone considers blaming/shunning language features 
> > just to
> > mask bad runtime semantics they refuse to fix - but maybe it's just my 
> > ignorance
> > on the subject that professes that view.
> I'd appreciate if you provided more detail on that. Far as I can tell
> the problem is with the feature under discussion, not some alleged
> unwillingness to do work.
> Andrei

I was thinking of where one implements an alternative runtime that may not
necessarily interface with a GC, or has different semantics on the use of the
keyword (the compiler, after all, doesn't know/care what exactly 'delete' does, 
it
just does it's job at emitting the correct action we expect from it). Though it
was late night and I perhaps didn't quite put it right anyway.

Having literally slept on it though, a runtime with a good GC is one that 
doesn't
offer a choice to 'delete' memory - so I am on your side.  Though reflecting 
from
my experience with the D2 GC (thinking from a compiler implementers POV), there
isn't much I can really praise about it.

Custom (de)allocators I can also see a very good point for removal too, however
much a stab in the back for those who can use them in the - perhaps 
'one-and-only'
- case for a small amount of speed gain.

Iain


Re: Linus with some good observations on garbage collection

2011-04-25 Thread Andrei Alexandrescu

On 04/25/2011 10:06 AM, Iain Buclaw wrote:

I was thinking of where one implements an alternative runtime that may not
necessarily interface with a GC, or has different semantics on the use of the
keyword (the compiler, after all, doesn't know/care what exactly 'delete' does, 
it
just does it's job at emitting the correct action we expect from it). Though it
was late night and I perhaps didn't quite put it right anyway.

Having literally slept on it though, a runtime with a good GC is one that 
doesn't
offer a choice to 'delete' memory - so I am on your side.  Though reflecting 
from
my experience with the D2 GC (thinking from a compiler implementers POV), there
isn't much I can really praise about it.

Custom (de)allocators I can also see a very good point for removal too, however
much a stab in the back for those who can use them in the - perhaps 
'one-and-only'
- case for a small amount of speed gain.

Iain


I understand. The right response to this is defining a good custom 
allocator API and providing good implementations of it.


Andrei


Re: Linus with some good observations on garbage collection

2011-04-25 Thread Sean Kelly
On Apr 24, 2011, at 7:25 PM, Robert Jacques wrote:
> 
> To the best of my knowledge, for 'small' allocations, the current GC uses 
> free-lists of power-of-2 sized blocks. (I think small is defined as anything 
> under 4k)

That's correct.

Re: Linus with some good observations on garbage collection

2011-04-25 Thread Sean Kelly
On Apr 24, 2011, at 1:17 PM, Andrei Alexandrescu wrote:
> 
> It is much better in very many ways, in particular in D. For one simple 
> argument, consider:
> 
> auto obj = new Widget;
> 
> Simple modularity considerations would require this operator to have 
> semantics independent of Widget, such that modular and generic code can use 
> it properly. Custom class allocators ruin all that, because they would 
> require the use of delete paired with new, whereas the general operator 
> requires no action.

Agreed, so long as there's some form of support for placement new.  But that 
could easily be an explicit __ctor() call into the allocated space, which works 
in D but I don't believe it works in C++ (thus the explicit placement new 
syntax).

Re: Linus with some good observations on garbage collection

2011-04-25 Thread Kim D
> On 04/25/2011 10:06 AM, Iain Buclaw wrote:
>> I was thinking of where one implements an alternative runtime that may 
>> not
>> necessarily interface with a GC, or has different semantics on the use of 
>> the
>> keyword (the compiler, after all, doesn't know/care what exactly 'delete' 
>> does, it
>> just does it's job at emitting the correct action we expect from it). 
>> Though it
>> was late night and I perhaps didn't quite put it right anyway.
>>
>> Having literally slept on it though, a runtime with a good GC is one that 
>> doesn't
>> offer a choice to 'delete' memory - so I am on your side.  Though 
>> reflecting from
>> my experience with the D2 GC (thinking from a compiler implementers POV), 
>> there
>> isn't much I can really praise about it.
>>
>> Custom (de)allocators I can also see a very good point for removal too, 
>> however
>> much a stab in the back for those who can use them in the - perhaps 
>> 'one-and-only'
>> - case for a small amount of speed gain.
>>
>> Iain
>
> I understand. The right response to this is defining a good custom 
> allocator API and providing good implementations of it.
>
> Andrei

Sounds like a good way forward. However, certain operations are currently 
not supported unless the GC is enabled, for instance 'Associative Arrays of 
Strings'... do you imagine the future allocator API beeing flexible enough 
to facilitate this use-case in a GC-less environment? Or at least allowing 
more of the functionality which is currently unsupported without GC?

Kim




Re: Linus with some good observations on garbage collection

2011-04-25 Thread Andrei Alexandrescu

On 4/25/11 4:20 PM, Kim D wrote:

On 04/25/2011 10:06 AM, Iain Buclaw wrote:

I was thinking of where one implements an alternative runtime that may
not
necessarily interface with a GC, or has different semantics on the use of
the
keyword (the compiler, after all, doesn't know/care what exactly 'delete'
does, it
just does it's job at emitting the correct action we expect from it).
Though it
was late night and I perhaps didn't quite put it right anyway.

Having literally slept on it though, a runtime with a good GC is one that
doesn't
offer a choice to 'delete' memory - so I am on your side.  Though
reflecting from
my experience with the D2 GC (thinking from a compiler implementers POV),
there
isn't much I can really praise about it.

Custom (de)allocators I can also see a very good point for removal too,
however
much a stab in the back for those who can use them in the - perhaps
'one-and-only'
- case for a small amount of speed gain.

Iain


I understand. The right response to this is defining a good custom
allocator API and providing good implementations of it.

Andrei


Sounds like a good way forward. However, certain operations are currently
not supported unless the GC is enabled, for instance 'Associative Arrays of
Strings'... do you imagine the future allocator API beeing flexible enough
to facilitate this use-case in a GC-less environment? Or at least allowing
more of the functionality which is currently unsupported without GC?

Kim


An entity relying on GC will by definition have more freedom and more 
safety than one relying on an alternative allocator. It is possible to 
make that transparent to some extent (e.g. define a type to be either 
refcounted or not depending on the allocator used). Total transparency 
would be difficult to achieve.


Andrei


Re: Linus with some good observations on garbage collection

2011-04-29 Thread Bruno Medeiros

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?




And that's since the C days btw. The function call is just an operation, 
the callee an operand, so any expression can be there.



They are converted to pointers to functions. Not something I'd recommend
because it makes the call slower.

Andrei


Hum, I didn't know that. Why does it make the call slower (other than 
the actual ternary operation), is it because it has to load the function 
address from a register/variable, instead of being a constant value? 
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in performance, 
almost negligible.



--
Bruno Medeiros - Software Engineer


Re: Linus with some good observations on garbage collection

2011-04-29 Thread Andrei Alexandrescu

On 4/29/11 10:40 AM, Bruno Medeiros wrote:

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?




And that's since the C days btw. The function call is just an operation,
the callee an operand, so any expression can be there.


They are converted to pointers to functions. Not something I'd recommend
because it makes the call slower.

Andrei


Hum, I didn't know that. Why does it make the call slower (other than
the actual ternary operation), is it because it has to load the function
address from a register/variable, instead of being a constant value?
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in performance,
almost negligible.


It's an indirect call instead of a direct call.

Andrei


Re: Linus with some good observations on garbage collection

2011-05-03 Thread Bruno Medeiros

On 29/04/2011 17:30, Andrei Alexandrescu wrote:

On 4/29/11 10:40 AM, Bruno Medeiros wrote:

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?




And that's since the C days btw. The function call is just an operation,
the callee an operand, so any expression can be there.


They are converted to pointers to functions. Not something I'd recommend
because it makes the call slower.

Andrei


Hum, I didn't know that. Why does it make the call slower (other than
the actual ternary operation), is it because it has to load the function
address from a register/variable, instead of being a constant value?
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in performance,
almost negligible.


It's an indirect call instead of a direct call.

Andrei


Well, yes, that's kinda of what I was thinking already (although I 
referred to more low-level, assembler terms).
But my question remains, why is an indirect function call slower than a 
direct one, and is that difference significant? (excluding of course the 
possibilities of inlining and other static analysis that direct 
invocations offer) There is no extra overhead other than loading the 
function address from a register/variable, right?



--
Bruno Medeiros - Software Engineer


Re: Linus with some good observations on garbage collection

2011-05-03 Thread Jérôme M. Berger
Bruno Medeiros wrote:
> On 29/04/2011 17:30, Andrei Alexandrescu wrote:
>> On 4/29/11 10:40 AM, Bruno Medeiros wrote:
>>> On 23/04/2011 15:45, Andrei Alexandrescu wrote:
 On 4/23/11 8:57 AM, dsimcha wrote:
> BTW, since when does the ternary operator work with functions, as
> opposed to variables?

>>>
>>> And that's since the C days btw. The function call is just an operation,
>>> the callee an operand, so any expression can be there.
>>>
 They are converted to pointers to functions. Not something I'd
 recommend
 because it makes the call slower.

 Andrei
>>>
>>> Hum, I didn't know that. Why does it make the call slower (other than
>>> the actual ternary operation), is it because it has to load the function
>>> address from a register/variable, instead of being a constant value?
>>> And/or not being able to inline the call?
>>> I mean, the first aspect seems like a very minor impact in performance,
>>> almost negligible.
>>
>> It's an indirect call instead of a direct call.
>>
>> Andrei
> 
> Well, yes, that's kinda of what I was thinking already (although I
> referred to more low-level, assembler terms).
> But my question remains, why is an indirect function call slower than a
> direct one, and is that difference significant? (excluding of course the
> possibilities of inlining and other static analysis that direct
> invocations offer) There is no extra overhead other than loading the
> function address from a register/variable, right?
> 
AIUI, with a direct call the CPU can anticipate the call (because
it knows the target address) and start filling the pipeline
immediately. OTOH, with an indirect call the CPU cannot know the
target until it reads the register, which means that the pipeline
will empty itself before the call is executed. With the pipeline
length of some modern CPUs (Pentium 4 being amongst the worst), this
can result in very large slowdowns.

Jerome
-- 
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr



signature.asc
Description: OpenPGP digital signature


Re: Linus with some good observations on garbage collection

2011-05-04 Thread Bruno Medeiros

On 03/05/2011 22:41, "Jérôme M. Berger" wrote:

Bruno Medeiros wrote:

On 29/04/2011 17:30, Andrei Alexandrescu wrote:

On 4/29/11 10:40 AM, Bruno Medeiros wrote:

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?




And that's since the C days btw. The function call is just an operation,
the callee an operand, so any expression can be there.


They are converted to pointers to functions. Not something I'd
recommend
because it makes the call slower.

Andrei


Hum, I didn't know that. Why does it make the call slower (other than
the actual ternary operation), is it because it has to load the function
address from a register/variable, instead of being a constant value?
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in performance,
almost negligible.


It's an indirect call instead of a direct call.

Andrei


Well, yes, that's kinda of what I was thinking already (although I
referred to more low-level, assembler terms).
But my question remains, why is an indirect function call slower than a
direct one, and is that difference significant? (excluding of course the
possibilities of inlining and other static analysis that direct
invocations offer) There is no extra overhead other than loading the
function address from a register/variable, right?


AIUI, with a direct call the CPU can anticipate the call (because
it knows the target address) and start filling the pipeline
immediately. OTOH, with an indirect call the CPU cannot know the
target until it reads the register, which means that the pipeline
will empty itself before the call is executed. With the pipeline
length of some modern CPUs (Pentium 4 being amongst the worst), this
can result in very large slowdowns.

Jerome


Ah right, that makes sense, forgot about that one (I see it's the same 
issue with branch prediction).


--
Bruno Medeiros - Software Engineer


Re: Linus with some good observations on garbage collection

2011-05-04 Thread Don

Jérôme M. Berger wrote:

Bruno Medeiros wrote:

On 29/04/2011 17:30, Andrei Alexandrescu wrote:

On 4/29/11 10:40 AM, Bruno Medeiros wrote:

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?

And that's since the C days btw. The function call is just an operation,
the callee an operand, so any expression can be there.


They are converted to pointers to functions. Not something I'd
recommend
because it makes the call slower.

Andrei

Hum, I didn't know that. Why does it make the call slower (other than
the actual ternary operation), is it because it has to load the function
address from a register/variable, instead of being a constant value?
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in performance,
almost negligible.

It's an indirect call instead of a direct call.

Andrei

Well, yes, that's kinda of what I was thinking already (although I
referred to more low-level, assembler terms).
But my question remains, why is an indirect function call slower than a
direct one, and is that difference significant? (excluding of course the
possibilities of inlining and other static analysis that direct
invocations offer) There is no extra overhead other than loading the
function address from a register/variable, right?


AIUI, with a direct call the CPU can anticipate the call (because
it knows the target address) and start filling the pipeline
immediately. OTOH, with an indirect call the CPU cannot know the
target until it reads the register, which means that the pipeline
will empty itself before the call is executed. With the pipeline
length of some modern CPUs (Pentium 4 being amongst the worst), this
can result in very large slowdowns.

Jerome


That was true ten years ago, but not any more. Modern CPUs can do branch 
prediction of indirect jumps.


Re: Linus with some good observations on garbage collection

2011-05-04 Thread dsimcha

On 5/4/2011 7:38 PM, Don wrote:

Jérôme M. Berger wrote:

Bruno Medeiros wrote:

On 29/04/2011 17:30, Andrei Alexandrescu wrote:

On 4/29/11 10:40 AM, Bruno Medeiros wrote:

On 23/04/2011 15:45, Andrei Alexandrescu wrote:

On 4/23/11 8:57 AM, dsimcha wrote:

BTW, since when does the ternary operator work with functions, as
opposed to variables?

And that's since the C days btw. The function call is just an
operation,
the callee an operand, so any expression can be there.


They are converted to pointers to functions. Not something I'd
recommend
because it makes the call slower.

Andrei

Hum, I didn't know that. Why does it make the call slower (other than
the actual ternary operation), is it because it has to load the
function
address from a register/variable, instead of being a constant value?
And/or not being able to inline the call?
I mean, the first aspect seems like a very minor impact in
performance,
almost negligible.

It's an indirect call instead of a direct call.

Andrei

Well, yes, that's kinda of what I was thinking already (although I
referred to more low-level, assembler terms).
But my question remains, why is an indirect function call slower than a
direct one, and is that difference significant? (excluding of course the
possibilities of inlining and other static analysis that direct
invocations offer) There is no extra overhead other than loading the
function address from a register/variable, right?


AIUI, with a direct call the CPU can anticipate the call (because
it knows the target address) and start filling the pipeline
immediately. OTOH, with an indirect call the CPU cannot know the
target until it reads the register, which means that the pipeline
will empty itself before the call is executed. With the pipeline
length of some modern CPUs (Pentium 4 being amongst the worst), this
can result in very large slowdowns.

Jerome


That was true ten years ago, but not any more. Modern CPUs can do branch
prediction of indirect jumps.


Yeh, I tested this a while back and can dig up/recreate the benchmark if 
need be.  A virtual function call that's always resolving to the same 
address (so the branch is highly predictable) is not measurably slower 
than a _non-inlined_ direct function call.  However, if the branch is 
not predictable, it's substantially slower and virtual functions usually 
can't be inlined.