On Wednesday, 3 September 2025 at 23:05:45 UTC, H. S. Teoh wrote:
On Wed, Sep 03, 2025 at 07:56:03PM +0000, Brother Bill via
Digitalmars-d-learn wrote:
[...]
C, C++ and D can play shenanigans with pointers, such as
casting them to size_t, which hides them from the GC.
D's current GC is conservative, meaning that any value it sees
that looks like it might be a pointer value, will be regarded
as a pointer value.
There is an optional precise GC that has been implemented, that
can be turned on with compiled-in options or command line
options, which uses a slightly less conservative scheme.
The recommendation is to avoid storing the *only* reference to an
allocated block in a `size_t`.
Even without the precise collector, the GC distinguishes
pointer-containing blocks from no-pointer blocks. This means that
it's quite easy to accidentally store the only copy of a pointer in
a `size_t` that will not be scanned, even with the conservative GC.
You should only store pointers as `size_t` "if you know what you
are doing". Otherwise do not do this.
It is fine to make a temporary *copy* of a pointer to a `size_t`
for example to examine the bits inside. This should leave the
original pointer alone.
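
To illustrate the "temporary copy" case, here is a minimal sketch. The helper name `isAligned` is made up for this example; the point is only that the original pointer stays where the GC can see it while the `size_t` copy is inspected.

```d
/// Hypothetical helper: checks the low bits of a pointer value.
/// `alignment` is assumed to be a power of two.
bool isAligned(const void* p, size_t alignment)
{
    // Temporary integer copy, used only to look at the bits.
    const size_t bits = cast(size_t) p;
    return (bits & (alignment - 1)) == 0;
}

void main()
{
    int* p = new int;
    *p = 42;

    // `p` itself is untouched and still visible to the GC.
    assert(isAligned(p, int.alignof));
    assert(*p == 42);
}
```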
[...]
GC.calloc can allocate memory for a slice of MyClass
instances. The developer may run GC.free to free the
allocated memory. But GC may perform its own garbage
collection of GC allocated memory blocks.
`GC.free` is going to free the memory. It will NOT run
finalizers. It will not collect it again later. I want to make
that clear.
If you do not explicitly free the memory, and it becomes garbage,
then the GC will collect it.
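
A small sketch of the manual route, assuming a block of plain `int`s (so no finalizers are involved and `NO_SCAN` is appropriate). `allocInts` is a name invented for the example.

```d
import core.memory : GC;

/// Hypothetical helper: zero-filled GC block viewed as an int slice.
int[] allocInts(size_t n)
{
    // The block holds only ints, so the GC does not need to scan it.
    auto p = cast(int*) GC.calloc(n * int.sizeof, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    int[] data = allocInts(16);
    foreach (x; data)
        assert(x == 0); // calloc zero-fills the block

    data[0] = 7;

    // Explicit release: the memory is gone now and no finalizer runs.
    GC.free(data.ptr);
    data = null; // do not keep a dangling slice around

    // Without the GC.free call, the block would simply be collected
    // once nothing referenced it any more.
}
```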
As far as a slice of `MyClass` instances, if you mean a slice of
data that contains the fields of an array of classes, you should
be very cautious of this. The GC is not equipped to call
finalizers on such a structure, and so you likely will run into
lifetime issues.
For classes, I'd just stick with `new`.
For structs, you can quite easily allocate an array of structs,
and the GC can support finalization of that. I also recommend just
using `new` here.
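
As a sketch of the `new` route for both cases (the type definitions here are stand-ins, not taken from the original question):

```d
// Stand-in types for illustration only.
class MyClass
{
    string name;
    this(string name) { this.name = name; }
}

struct Point
{
    int x, y;
}

/// An array of class references; each object is allocated with `new`.
MyClass[] makeClasses(size_t n)
{
    auto arr = new MyClass[](n); // n null references
    foreach (ref c; arr)
        c = new MyClass("item");
    return arr;
}

void main()
{
    auto objs = makeClasses(3);
    auto pts = new Point[](4); // array of structs, each Point.init

    assert(objs.length == 3 && objs[0].name == "item");
    assert(pts[0] == Point(0, 0));
}
```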
Let's look at each attribute: (confirm if my analysis is right,
otherwise correct)
FINALIZE - just before GC reclaims the memory, such as with
GC.free, call destructors, aka finalizers.
This bit is probably best left untouched by user code, and left
to the runtime to figure out when/how to use it.
In the latest compiler (2.111), this has been changed to a bit
that requests finalization upon allocation. The GC uses this bit
and the typeinfo passed in to determine the correct action. This
is different from before, where the bit was an implementation
detail and you had to know exactly what you were asking for.
I do agree that you should basically leave this alone. But for
sure the new treatment of the bit is more robust than before.
Note: changing bits after allocation *does not* take this into
account; at that point you are modifying implementation details.
I really would like to get rid of these bits completely and use a
more reliable API (having a set of implementation bits as an
option is quite dangerous).
NO_SCAN - There may be false positives regarding byte values
that look like 'new' allocated pointers. This can result in
'garbage' memory not being collected. If we are CERTAIN that
this memory block doesn't contain any pointers to 'new'
SomeClass allocated memory, then mark as NO_SCAN.
Correct. Though if you're writing idiomatic D code, you'll
almost never need to worry about this. Whenever you allocate
an array whose elements are PODs (without any pointers), the
allocator will automatically mark the memory NO_SCAN so that
the GC doesn't waste time scanning such blocks. So things like
implicit string allocations will be marked NO_SCAN, etc. If
you're allocating an array or object that contains
indirections, then NO_SCAN will not be set, so the GC will scan
the interior of such blocks for pointers to other live objects.
I will add that the concern of scanning non-pointers is pretty
much obsolete with 64-bit addressing. It's still important to use
`NO_SCAN`, as it's quite common to allocate large blocks of data
that are just bytes (e.g. load a file). You don't want to waste
time scanning that, even if there are no false-positives to be
found in there.
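
One way to see the automatic marking is to ask the GC about the blocks behind two arrays. This is only a sketch: `isNoScan` is a helper made up here, and it uses `GC.query` because an array's `.ptr` is not guaranteed to be the base address of its block.

```d
import core.memory : GC;

/// Hypothetical helper: does the GC block containing `p` have NO_SCAN set?
bool isNoScan(const void* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto bytes = new ubyte[](4096); // plain data, e.g. file contents
    auto ptrs = new int*[](16);     // elements are pointers

    assert(isNoScan(bytes.ptr));  // the runtime marked it NO_SCAN
    assert(!isNoScan(ptrs.ptr));  // this block has to be scanned
}
```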
Question 1: if GC-calloc has allocated MyClass that has a
string 'name' member, which may expand in size, would we
still properly apply NO_SCAN?
I would say this is not true. A string has a pointer, it should
be scanned.
Question 2: if GC-calloc has allocated MyClass, which may
allocate new MyStudent(...), would that mean 'don't apply
NO_SCAN'?
It's very simple. If a memory block may contain pointers, then
it should not be NO_SCAN. If a memory block never contains any
pointers, then it can (should) be marked NO_SCAN.
100% correct.
Normal D code does not need to fiddle with GC flags.
Great advice!
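
If you do end up allocating raw blocks yourself, the rule quoted above can be decided at compile time rather than by hand. A minimal sketch, assuming `std.traits.hasIndirections` is an acceptable test for "may contain pointers"; `attrFor`, `allocZeroed`, and the two structs are names invented for the example.

```d
import core.memory : GC;
import std.traits : hasIndirections;

struct Plain { int id; double score; }  // no pointers anywhere
struct Named { int id; string name; }   // string is pointer + length

/// Hypothetical helper: NO_SCAN only when T cannot hold a pointer.
uint attrFor(T)()
{
    return hasIndirections!T ? GC.BlkAttr.NONE : GC.BlkAttr.NO_SCAN;
}

/// Hypothetical helper: one zero-filled T (zero bits, not T.init).
T* allocZeroed(T)() if (is(T == struct))
{
    return cast(T*) GC.calloc(T.sizeof, attrFor!T());
}

void main()
{
    assert(attrFor!Plain() == GC.BlkAttr.NO_SCAN);
    assert(attrFor!Named() == GC.BlkAttr.NONE); // must be scanned

    auto n = allocZeroed!Named();
    n.name = "Alice".idup; // GC-allocated string, kept alive via the block
    assert(n.name == "Alice");
}
```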
NO_MOVE - For GC.realloc, if increasing memory allocated, and
it's not available, throw 'MEMORY_NOT_AVAILABLE' exception.
Correct. You might want to use this flag if you have non-D code
that might be holding pointers to this memory block, e.g., if
you passed a pointer to some D array to C code which retains it
in some C-managed pointer, and the C code expects the array to
still be there later.
It's not very often that such situations come up, though. When
passing GC-allocated data to C code, it's generally a good idea
to keep a reference to it inside D code so that the GC can find
the reference anyway. Since D doesn't have a moving GC, this
is really all you need to do. Again, unless you're doing
something unusual, you probably don't need to touch the NO_MOVE
flag.
No, this is not correct. `NO_MOVE` is supposed to mean that a
moving GC cannot move this block (and fix up pointers to it).
Given that we have a conservative GC, which scans the stack
conservatively *including C stacks*, and we will always have one,
I would say this bit should just be deprecated.
Indeed, it is completely ignored in the current GC.
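
For the case of handing GC memory to C code that keeps the pointer somewhere the GC does not scan (the C heap, for example), no flag is needed. A sketch of one option, using `GC.addRoot` and `GC.removeRoot` from `core.memory`; `handOff` and `takeBack` are made-up names, and the actual C call is left out.

```d
import core.memory : GC;

/// Hypothetical helper: allocate a buffer and pin it as a GC root
/// before giving the pointer to foreign code.
void* handOff(size_t n)
{
    void* p = GC.malloc(n, GC.BlkAttr.NO_SCAN);
    GC.addRoot(p); // block stays alive even with no D references
    return p;
}

/// Hypothetical helper: call once the foreign code is done with `p`.
void takeBack(void* p)
{
    GC.removeRoot(p); // block becomes collectable again when unreferenced
}

void main()
{
    auto p = cast(ubyte*) handOff(64);
    p[0] = 0xAB;

    GC.collect(); // the rooted block survives the collection
    assert(p[0] == 0xAB);

    takeBack(p);
}
```

Simply keeping a D reference alive (a module-level variable, a field of a long-lived object) achieves the same thing without the add/remove bookkeeping.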
APPENDABLE - For D internal runtime use. Don't mark this
yourself.
Yes.
Also improved with D 2.111. The `APPENDABLE` bit is now an input
to malloc that tells the GC this is an array (including adjusting
the size to deal with padding space). The GC now handles array
runtime features directly, and so it understands what this means.
So in fact, this is a bit you can set, and there are currently
unexposed GC interface functions that can be used to manage the
array. They have not yet been exposed in `core.memory`, because
we are not sure if these are the final interfaces we want.
However, *allocating* an array with this bit will do exactly what
you expect (and managing the resulting array with the normal
array management functions such as appending or `capacity` will
work).
I do still recommend using `new`.
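
A short sketch of the recommended route, using only the normal array features (`reserve`, `capacity`, appending). It deliberately does not set `APPENDABLE` by hand; `build` is a name invented for the example.

```d
/// Hypothetical helper: builds 0 .. n using ordinary array appends.
int[] build(size_t n)
{
    int[] arr;
    arr.reserve(n); // ask the runtime for room up front
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr;
}

void main()
{
    auto a = build(10);
    assert(a.length == 10);
    assert(a.capacity >= a.length); // runtime tracks the appendable block
    assert(a[9] == 9);
}
```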
NO_INTERIOR - This says that only the base address of the
block may be a target address of other GC allocated pointers.
All other possible pointers are 'false' pointers.
Yes, though I would say it like:
"only pointers found while scanning that point to the exact
target address may be considered pointers to the block."
Again, this is really only of great use in 32-bit addressing.
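
A cautious sketch of what using `NO_INTERIOR` looks like, assuming a large block of plain bytes. `allocNoInterior` is a made-up name. The important part is the comment: you must keep a pointer to the start of the block, because pointers into the middle will not keep it alive.

```d
import core.memory : GC;

/// Hypothetical helper: large zero-filled byte buffer that is neither
/// scanned nor kept alive by interior pointers.
ubyte[] allocNoInterior(size_t n)
{
    auto p = cast(ubyte*) GC.calloc(
        n, GC.BlkAttr.NO_SCAN | GC.BlkAttr.NO_INTERIOR);
    return p[0 .. n];
}

void main()
{
    // `big.ptr` is the base address; keep `big` itself around.
    ubyte[] big = allocNoInterior(1 << 20);
    assert(big.length == 1 << 20);

    // A sub-slice like `big[100 .. 200]` only holds an interior pointer.
    // On its own it would NOT keep the block alive, so never let such a
    // slice outlive `big`.
    auto window = big[100 .. 200];
    window[0] = 1;
    assert(big[100] == 1);
}
```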
Perhaps I am missing the fundamentals of various D garbage
collectors.
[...]
The various GC flags are simply hints that let you influence
the scanning process to some extent. The NO_SCAN bit means that
upon reaching this block, don't bother scanning its contents to
find more pointers (because there are none). The NO_INTERIOR
bit means that if the GC finds a pointer-like value that looks
like it points to the inside of this block, ignore it as a
non-pointer, because pointers to this block only ever point to
its head (the supposed pointer is actually not a real pointer,
but an integer value that happens to have a pointer-like value).
The other flags have very specific uses that, if you don't know
what they actually do, you probably don't need them and
shouldn't touch them.
Flags you should be able to use:
* `NO_SCAN`
* `FINALIZE`
* `APPENDABLE`
* `NO_INTERIOR` (very cautiously)
Do not use any other bits directly. A future version of D likely
will migrate these into function parameters instead of providing
bits.
-Steve