GC behavior
If I pool all unused objects such that no object needs to be GC'ed, does it still perform scanning? What are other good ways to avoid its overhead? As you might tell, I know rather little how D's garbage collection works. I'm working on a game engine and trying to be as resource efficient as possible. FYI, I've been using Rust for the last three months and decided to take a break from it. The documentation is far from the quality that D has and managing explicit lifetimes becomes a serious pain during mid project, especially in cases that you know are already safe.
Re: GC behavior
"Jonathan" wrote in message news:hsdxglmqtcnhskwzr...@forum.dlang.org... If I pool all unused objects such that no object needs to be GC'ed, does it still perform scanning? What are other good ways to avoid its overhead? As you might tell, I know rather little how D's garbage collection works. I'm working on a game engine and trying to be as resource efficient as possible. A GC collection will run when you allocate with 'new' and there is not enough memory already reserved by the gc. So if you don't use 'new' then it won't run a collection, otherwise it might. You can disable all collections by importing 'core.memory' and calling 'GC.disable()'.
Re: GC behavior
On Monday, 6 October 2014 at 16:23:41 UTC, Jonathan wrote: If I pool all unused objects such that no object needs to be GC'ed, does it still perform scanning? What are other good ways to avoid its overhead? As you might tell, I know rather little how D's garbage collection works. I'm working on a game engine and trying to be as resource efficient as possible. FYI, I've been using Rust for the last three months and decided to take a break from it. The documentation is far from the quality that D has and managing explicit lifetimes becomes a serious pain during mid project, especially in cases that you know are already safe. Suggestion (may or may not be useful depending on your game): use an ECS (http://t-machine.org/index.php/2007/09/03/entity-systems-are-the-future-of-mmog-development-part-1/) approach with big, manually allocated arrays of structs in the implementation. I'm working on something but it's not documented enough/API is butt-ugly/not nearly stable-enough yet: https://github.com/kiith-sa/tharsis-core Some people are working on other ECS's too, see code.dlang.org (some are very efficient, some are not... don't remember which). And (very simplistic advice, but...) if ECS doesn't fit your needs or you do use GC to allocate a lot of stuff anyway, reuse dead objects and control the GC by disable()ing/reenabling and explicitly fullCollect()ing.
Re: GC behavior
Kiith-Sa, thanks for the info! I started to check out your entity project and love how your incorporating threads and syncing new/old state. You did state that your cleaning up things, but my initial reaction is that entitymanager is performing too many roles. I'd remove the heavy game state and threading into another class to make it cleaner, imho. https://github.com/kiith-sa/tharsis-core/blob/master/source/tharsis/entity/entitymanager.d
Re: GC behavior
On Monday, 6 October 2014 at 17:46:34 UTC, Jonathan wrote: Kiith-Sa, thanks for the info! I started to check out your entity project and love how your incorporating threads and syncing new/old state. You did state that your cleaning up things, but my initial reaction is that entitymanager is performing too many roles. I'd remove the heavy game state and threading into another class to make it cleaner, imho. https://github.com/kiith-sa/tharsis-core/blob/master/source/tharsis/entity/entitymanager.d Yeah, the threading is currently the most work-in-progress thing (basically, I'm dealing with tradeoffs between trying to get the best average performance and least latency from non-realtime OS scheduling). EntityManager is pretty much the kitchen-sink class where things exist until they are refactored away (most of Tharsis code used to be there at some point in the past). It will probably end up just as a shell delegating to all internals regarding entit/component management and process execution. However, EntityManager API itself is more complicated than needed for most users (and needs to be used together with other classes which add to the complexity) - it, together with other 'low-level' API classes will probably eventually be hidden behind a more user-friendly facade (yeah, I hate design patterns, but that's pretty much what it will be), with the ability to access EntityManager/others for more advanced users.
GC behavior and Allocators
Experience has shown that using allocators can drastically improve the execution time of D programs. It has also shown that the biggest issue with allocators is that unless you are very careful, the GC will start freeing your live memory. I think that we need to define the behavior of core.memory.GC.addRange more precisely. Consider the following pseudo-code: auto a = allocate(aSize); GC.addRange(a, aSize); auto a = reallocate(a, aSize * 2); GC.addRange(a, aSize * 2); What does this do? Well, if the proxy pointer inside of the GC is null, it creates two entries. If it's not null and the default GC implementation is being used, it results in one entry being present. This single entry will only contain the smaller size, leaving the GC free to collect memory pointed to by the second half of your array. How about removing? Again, it depends. If the default GC is being used there can only be one entry per pointer, so removeRange does exactly what you expect. If the proxy pointer is null, gc_removeRange falls back to an implementation that scans through its ranges array and overwrites the first entry containing a matching pointer with the last entry in the array, and then decrementing the array length. This reordering means if you have multiple entries for various pointers, you don't know which one is being removed when you call removeRange. This is fine if you like to practice luck-oriented programming. What about calling removeRange and then addRange immediately? This will only work in single-threaded code. In a multi-threaded program the GC can run between the calls to removeRange and addRange. I think that the only sane way to solve this is to define in the specs for core.memory that GC.addRange will only ever store one entry per pointer, and that the length will be the value of "sz" from the most recent call to addRange. Thoughts?
Re: GC behavior and Allocators
On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote: I think that the only sane way to solve this is to define in the specs for core.memory that GC.addRange will only ever store one entry per pointer, and that the length will be the value of "sz" from the most recent call to addRange. Thoughts? Easy to change the behaviour of the 2.066 GC, just force update instead of ignoring duplicates: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/util/container/treap.d#L138 I feel there are instances where core.memory is underspecified. When I was implementing the update for addRange from array to treap it was considered "undefined behaviour" to addRange twice with the same base pointer without first removeRange'ing the second. That's why the code currently ignores dups. Boehm GC on the other hand uses an array similar to the previous version of the D2 GC, except that it has code for properly merging duplicates (this has performance implications though.) Please keep sharing your insight into practical use of these APIs. :)
Re: GC behavior and Allocators
On Thursday, 3 July 2014 at 21:53:25 UTC, Brian Schott wrote: I think that the only sane way to solve this is to define in the specs for core.memory that GC.addRange will only ever store one entry per pointer, and that the length will be the value of "sz" from the most recent call to addRange. Thoughts? I just thought a little more about this and you will always have a race. Consider this code: auto a = malloc(aSize); GC.addRange(a, aSize); auto b = realloc(a, aSize * 2); If realloc moves the data (a != b) and the GC runs before you can fix up the ranges you can have invalid access during collection.
Re: GC behavior and Allocators
If you reallocate doubling the size, it's likely such reallocs always move, so they should be equivalent to malloc+free, so your code can be mem2 = alloc(sz*2); mem2[] = mem1[]; addRange(mem2); removeRange(mem1); free(mem1); if not, you need to lock the GC so that it won't interfere during realloc.
Re: GC behavior and Allocators
On Friday, 4 July 2014 at 10:36:03 UTC, safety0ff wrote: I just thought a little more about this and you will always have a race. Consider this code: auto a = malloc(aSize); GC.addRange(a, aSize); auto b = realloc(a, aSize * 2); If realloc moves the data (a != b) and the GC runs before you can fix up the ranges you can have invalid access during collection. http://dlang.org/phobos/core_memory.html#.GC.enable If GC.enable and GC.disable truly disallowed GC running (or alternative `GC.hard_disable`/`GC.hard_enable` existed that guaranteed such) then you could use that to make sure that the GC didn't collect in the middle of a pair of those calls. On Friday, 4 July 2014 at 11:54:07 UTC, Kagamin wrote: If you reallocate doubling the size, it's likely such reallocs always move, so they should be equivalent to malloc+free, so your code can be mem2 = alloc(sz*2); mem2[] = mem1[]; addRange(mem2); removeRange(mem1); free(mem1); if not, you need to lock the GC so that it won't interfere during realloc. Is there a way to lock the GC currently?
Re: GC behavior and Allocators
On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote: Is there a way to lock the GC currently? There are critical regions in core.thread. While in such a region, your thread will never be suspended, effectively also precluding the GC from running. They are a rather dangerous tool though, use them only if absolutely required and on small, controlled pieces of code. David
Re: GC behavior and Allocators
On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote: If GC.enable and GC.disable truly disallowed GC running (or alternative `GC.hard_disable`/`GC.hard_enable` existed that guaranteed such) then you could use that to make sure that the GC didn't collect in the middle of a pair of those calls. Right, this should be sufficient in the current implementation. On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote: There are critical regions in core.thread. While in such a region, your thread will never be suspended, effectively also precluding the GC from running. They are a rather dangerous tool though, use them only if absolutely required and on small, controlled pieces of code. I think this will deadlock on the GC mutex.
Re: GC behavior and Allocators
On Friday, 4 July 2014 at 17:05:32 UTC, David Nadlinger wrote: On Friday, 4 July 2014 at 14:47:12 UTC, Chris Cain wrote: Is there a way to lock the GC currently? There are critical regions in core.thread. While in such a region, your thread will never be suspended, effectively also precluding the GC from running. They are a rather dangerous tool though, use them only if absolutely required and on small, controlled pieces of code. David So something like this would work? void* GCAwareRealloc(void* ptr, size_t size) { import core.thread; void* r; thread_enterCriticalRegion(); scope (exit) thread_exitCriticalRegion(); r = realloc(ptr, size); // Assuming that addRange performs an update of size GC.addRange(r, size); return r; }
Re: GC behavior and Allocators
On Sunday, 6 July 2014 at 04:58:42 UTC, Brian Schott wrote: So something like this would work? void* GCAwareRealloc(void* ptr, size_t size) { import core.thread; void* r; thread_enterCriticalRegion(); scope (exit) thread_exitCriticalRegion(); r = realloc(ptr, size); // Assuming that addRange performs an update of size GC.addRange(r, size); return r; } No, it can cause a deadlock: Thread 1: thread_enterCriticalRegion(); Thread 2: Takes GC lock. Thread 1: Tries to take GC lock to add range, but it has to wait. Thread 2: triggers a collection -> thread_suspendAll(). Thread 2 waits on thread 1 to exit critical section, thread 1 waits on thread 2 to release GC lock. So the best we can do at the moment is: void* GCAwareRealloc(void* ptr, size_t size) { void* r; GC.disable; scope (exit) GC.enable; GC.removeRange(ptr); r = realloc(ptr, size); GC.addRange(r, size); return r; } With your proposed enhancement we get: void* GCAwareRealloc(void* ptr, size_t size) { void* r; GC.disable; scope (exit) GC.enable; r = realloc(ptr, size); if (r != ptr) GC.removeRange(ptr); GC.addRange(r, size); // if r == ptr, only updates size return r; } Which executes a little faster because we sometimes do 1 less GC range operation (plus 1 less GC lock.)