GC behavior

2014-10-06 Thread Jonathan via Digitalmars-d
If I pool all unused objects such that no object needs to be 
GC'ed, does it still perform scanning? What are other good ways 
to avoid its overhead? As you might tell, I know rather little 
how D's garbage collection works. I'm working on a game engine 
and trying to be as resource efficient as possible.


FYI, I've been using Rust for the last three months and decided 
to take a break from it. The documentation is far from the 
quality that D has and managing explicit lifetimes becomes a 
serious pain during mid project, especially in cases that you 
know are already safe.


Re: GC behavior

2014-10-06 Thread Daniel Murphy via Digitalmars-d

"Jonathan"  wrote in message news:hsdxglmqtcnhskwzr...@forum.dlang.org...

If I pool all unused objects such that no object needs to be GC'ed, does 
it still perform scanning? What are other good ways to avoid its overhead? 
As you might tell, I know rather little how D's garbage collection works. 
I'm working on a game engine and trying to be as resource efficient as 
possible.


A GC collection will run when you allocate with 'new' and there is not 
enough memory already reserved by the gc.  So if you don't use 'new' then it 
won't run a collection, otherwise it might.  You can disable all collections 
by importing 'core.memory' and calling 'GC.disable()'. 



Re: GC behavior

2014-10-06 Thread Kiith-Sa via Digitalmars-d

On Monday, 6 October 2014 at 16:23:41 UTC, Jonathan wrote:
If I pool all unused objects such that no object needs to be 
GC'ed, does it still perform scanning? What are other good ways 
to avoid its overhead? As you might tell, I know rather little 
how D's garbage collection works. I'm working on a game engine 
and trying to be as resource efficient as possible.


FYI, I've been using Rust for the last three months and decided 
to take a break from it. The documentation is far from the 
quality that D has and managing explicit lifetimes becomes a 
serious pain during mid project, especially in cases that you 
know are already safe.


Suggestion (may or may not be useful depending on your game): use 
an ECS 
(http://t-machine.org/index.php/2007/09/03/entity-systems-are-the-future-of-mmog-development-part-1/) 
approach with big, manually allocated arrays of structs in the 
implementation. I'm working on something but it's not documented 
enough/API is butt-ugly/not nearly stable-enough yet: 
https://github.com/kiith-sa/tharsis-core


Some people are working on other ECS's too, see code.dlang.org 
(some are very efficient, some are not... don't remember which).



And (very simplistic advice, but...) if ECS doesn't fit your 
needs or you do use GC to allocate a lot of stuff anyway, reuse 
dead objects and control the GC by disable()ing/reenabling and 
explicitly fullCollect()ing.


Re: GC behavior

2014-10-06 Thread Jonathan via Digitalmars-d
Kiith-Sa, thanks for the info! I started to check out your entity 
project and love how your incorporating threads and syncing 
new/old state. You did state that your cleaning up things, but my 
initial reaction is that entitymanager is performing too many 
roles. I'd remove the heavy game state and threading into another 
class to make it cleaner, imho.


https://github.com/kiith-sa/tharsis-core/blob/master/source/tharsis/entity/entitymanager.d


Re: GC behavior

2014-10-06 Thread Kiith-Sa via Digitalmars-d

On Monday, 6 October 2014 at 17:46:34 UTC, Jonathan wrote:
Kiith-Sa, thanks for the info! I started to check out your 
entity project and love how your incorporating threads and 
syncing new/old state. You did state that your cleaning up 
things, but my initial reaction is that entitymanager is 
performing too many roles. I'd remove the heavy game state and 
threading into another class to make it cleaner, imho.


https://github.com/kiith-sa/tharsis-core/blob/master/source/tharsis/entity/entitymanager.d


Yeah, the threading is currently the most work-in-progress thing 
(basically, I'm dealing with tradeoffs between trying to get the 
best average performance and least latency from non-realtime OS 
scheduling).


EntityManager is pretty much the kitchen-sink class where things 
exist until they are refactored away (most of Tharsis code used 
to be there at some point in the past). It will probably end up 
just as a shell delegating to all internals regarding 
entit/component management and process execution. However, 
EntityManager API itself is more complicated than needed for most 
users (and needs to be used together with other classes which add 
to the complexity) - it, together with other 'low-level' API 
classes will probably eventually be hidden behind a more 
user-friendly facade (yeah, I hate design patterns, but that's 
pretty much what it will be), with the ability to access 
EntityManager/others for more advanced users.


GC behavior and Allocators

2014-07-03 Thread Brian Schott via Digitalmars-d
Experience has shown that using allocators can drastically 
improve the execution time of D programs. It has also shown that 
the biggest issue with allocators is that unless you are very 
careful, the GC will start freeing your live memory.


I think that we need to define the behavior of 
core.memory.GC.addRange more precisely. Consider the following 
pseudo-code:


auto a = allocate(aSize);
GC.addRange(a, aSize);
auto a = reallocate(a, aSize * 2);
GC.addRange(a, aSize * 2);

What does this do? Well, if the proxy pointer inside of the GC is 
null, it creates two entries. If it's not null and the default GC 
implementation is being used, it results in one entry being 
present. This single entry will only contain the smaller size, 
leaving the GC free to collect memory pointed to by the second 
half of your array.


How about removing? Again, it depends. If the default GC is being 
used there can only be one entry per pointer, so removeRange does 
exactly what you expect. If the proxy pointer is null, 
gc_removeRange falls back to an implementation that scans through 
its ranges array and overwrites the first entry containing a 
matching pointer with the last entry in the array, and then 
decrementing the array length. This reordering means if you have 
multiple entries for various pointers, you don't know which one 
is being removed when you call removeRange. This is fine if you 
like to practice luck-oriented programming.


What about calling removeRange and then addRange immediately? 
This will only work in single-threaded code. In a multi-threaded 
program the GC can run between the calls to removeRange and 
addRange.


I think that the only sane way to solve this is to define in the 
specs for core.memory that GC.addRange will only ever store one 
entry per pointer, and that the length will be the value of "sz" 
from the most recent call to addRange.


Thoughts?


Re: GC behavior and Allocators

2014-07-03 Thread safety0ff via Digitalmars-d

On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:
I think that the only sane way to solve this is to define in 
the specs for core.memory that GC.addRange will only ever store 
one entry per pointer, and that the length will be the value of 
"sz" from the most recent call to addRange.


Thoughts?


Easy to change the behaviour of the 2.066 GC, just force update 
instead of ignoring duplicates:

https://github.com/D-Programming-Language/druntime/blob/master/src/rt/util/container/treap.d#L138

I feel there are instances where core.memory is underspecified.
When I was implementing the update for addRange from array to 
treap it was considered "undefined behaviour" to addRange twice 
with the same base pointer without first removeRange'ing the 
second. That's why the code currently ignores dups.


Boehm GC on the other hand uses an array similar to the previous 
version of the D2 GC, except that it has code for properly 
merging duplicates (this has performance implications though.)


Please keep sharing your insight into practical use of these 
APIs. :)


Re: GC behavior and Allocators

2014-07-04 Thread safety0ff via Digitalmars-d

On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote:


I think that the only sane way to solve this is to define in 
the specs for core.memory that GC.addRange will only ever store 
one entry per pointer, and that the length will be the value of 
"sz" from the most recent call to addRange.


Thoughts?


I just thought a little more about this and you will always have 
a race.


Consider this code:
auto a = malloc(aSize);
GC.addRange(a, aSize);
auto b = realloc(a, aSize * 2);

If realloc moves the data (a != b) and the GC runs before you can 
fix up the ranges you can have invalid access during collection.


Re: GC behavior and Allocators

2014-07-04 Thread Kagamin via Digitalmars-d
If you reallocate doubling the size, it's likely such reallocs 
always move, so they should be equivalent to malloc+free, so your 
code can be


mem2 = alloc(sz*2);
mem2[] = mem1[];
addRange(mem2);
removeRange(mem1);
free(mem1);

if not, you need to lock the GC so that it won't interfere during 
realloc.


Re: GC behavior and Allocators

2014-07-04 Thread Chris Cain via Digitalmars-d

On Friday, 4 July 2014 at 10:36:03 UTC, safety0ff wrote:
I just thought a little more about this and you will always 
have a race.


Consider this code:
auto a = malloc(aSize);
GC.addRange(a, aSize);
auto b = realloc(a, aSize * 2);

If realloc moves the data (a != b) and the GC runs before you 
can fix up the ranges you can have invalid access during 
collection.


http://dlang.org/phobos/core_memory.html#.GC.enable

If GC.enable and GC.disable truly disallowed GC running (or 
alternative `GC.hard_disable`/`GC.hard_enable` existed that 
guaranteed such) then you could use that to make sure that the GC 
didn't collect in the middle of a pair of those calls.


On Friday, 4 July 2014 at 11:54:07 UTC, Kagamin wrote:
If you reallocate doubling the size, it's likely such reallocs 
always move, so they should be equivalent to malloc+free, so 
your code can be


mem2 = alloc(sz*2);
mem2[] = mem1[];
addRange(mem2);
removeRange(mem1);
free(mem1);

if not, you need to lock the GC so that it won't interfere 
during realloc.


Is there a way to lock the GC currently?


Re: GC behavior and Allocators

2014-07-04 Thread David Nadlinger via Digitalmars-d

On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:

Is there a way to lock the GC currently?


There are critical regions in core.thread. While in such a 
region, your thread will never be suspended, effectively also 
precluding the GC from running. They are a rather dangerous tool 
though, use them only if absolutely required and on small, 
controlled pieces of code.


David


Re: GC behavior and Allocators

2014-07-04 Thread safety0ff via Digitalmars-d

On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:


If GC.enable and GC.disable truly disallowed GC running (or 
alternative `GC.hard_disable`/`GC.hard_enable` existed that 
guaranteed such) then you could use that to make sure that the 
GC didn't collect in the middle of a pair of those calls.


Right, this should be sufficient in the current implementation.

On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:


There are critical regions in core.thread. While in such a 
region, your thread will never be suspended, effectively also 
precluding the GC from running. They are a rather dangerous 
tool though, use them only if absolutely required and on small, 
controlled pieces of code.


I think this will deadlock on the GC mutex.


Re: GC behavior and Allocators

2014-07-05 Thread Brian Schott via Digitalmars-d

On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote:

On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote:

Is there a way to lock the GC currently?


There are critical regions in core.thread. While in such a 
region, your thread will never be suspended, effectively also 
precluding the GC from running. They are a rather dangerous 
tool though, use them only if absolutely required and on small, 
controlled pieces of code.


David


So something like this would work?

void* GCAwareRealloc(void* ptr, size_t size)
{
  import core.thread;
  void* r;
  thread_enterCriticalRegion();
  scope (exit) thread_exitCriticalRegion();
  r = realloc(ptr, size);
  // Assuming that addRange performs an update of size
  GC.addRange(r, size);
  return r;
}


Re: GC behavior and Allocators

2014-07-06 Thread safety0ff via Digitalmars-d

On Sunday, 6 July 2014 at 04:58:42 UTC, Brian Schott wrote:


So something like this would work?

void* GCAwareRealloc(void* ptr, size_t size)
{
  import core.thread;
  void* r;
  thread_enterCriticalRegion();
  scope (exit) thread_exitCriticalRegion();
  r = realloc(ptr, size);
  // Assuming that addRange performs an update of size
  GC.addRange(r, size);
  return r;
}


No, it can cause a deadlock:

Thread 1: thread_enterCriticalRegion();
Thread 2: Takes GC lock.
Thread 1: Tries to take GC lock to add range, but it has to wait.
Thread 2: triggers a collection -> thread_suspendAll().

Thread 2 waits on thread 1 to exit critical section, thread 1 
waits on thread 2 to release GC lock.


So the best we can do at the moment is:
void* GCAwareRealloc(void* ptr, size_t size)
{
  void* r;
  GC.disable;
  scope (exit) GC.enable;
  GC.removeRange(ptr);
  r = realloc(ptr, size);
  GC.addRange(r, size);
  return r;
}

With your proposed enhancement we get:
void* GCAwareRealloc(void* ptr, size_t size)
{
  void* r;
  GC.disable;
  scope (exit) GC.enable;
  r = realloc(ptr, size);
  if (r != ptr)
GC.removeRange(ptr);
  GC.addRange(r, size); // if r == ptr, only updates size
  return r;
}

Which executes a little faster because we sometimes do 1 less GC 
range operation (plus 1 less GC lock.)