On Thu, 01 Apr 2010 01:41:02 -0400, Mike S
<mi...@notarealaddresslololololol.com> wrote:
Steven Schveighoffer wrote:
> What do you mean by nondeterministic? It's very deterministic, just
not
always easy to determine ;) However, given enough context, it's really
easy to determine.
When I say deterministic, I'm referring to determinism from the user's
point of view, where the allocation behavior is affected solely by the
parameter (the size request, e.g. 10000 objects) and not by some kind of
internal state, hidden context, or arcane black magic. :p
Its abstracted to the GC, but the current GC is well defined. If you
request to allocate blocks with length of a power of 2 under a page, you
will get exactly that length, all the way down to 16 bytes. If you
request to allocate a page or greater, you get a contiguous block of
memory that is a multiple of a page.
With that definition, is the allocator deterministic enough for your needs?
> The amount of memory given is determined by the GC, and ultimately by
the OS. The currently supported OSes allocate in Page-sized chunks, so
when you allocate any memory from the OS, you are allocating a page
(4k). Most likely, you may not need a whole page for the data you are
allocating, so the GC gives you more finely sized chunks by breaking up
a page into smaller pieces. This strategy works well in some cases,
and can be wasteful in others. The goal is to strike a balance that is
"good enough" for everyday programming, but can be specialized when you
need it.
That's understandable, and it makes sense that the actual memory being
allocated would correspond to some chunk size. It's really just opaque
black box behavior that poses a problem; if users are given well-defined
guidelines and chunk sizes, that would work just fine. For instance, a
spec like, "reserve a multiple of 512 bytes and that's exactly what you
will be given," would allow users to minimize wastefulness and know
precisely how much memory they're allocating.
I think in the interest of allowing innovative freedom, such requirements
should be left up to the GC implementor, not the spec or runtime. Anyone
who wants to closely control memory usage should just understand how the
GC they are using works.
If you want to control memory allocation yourself, you can always do
that by allocating page-sized chunks and doing the memory management on
those chunks yourself. I do something very similar in dcollections to
speed up allocation/destruction.
<snip>
I think D has deterministic allocation, and better ability than C++ to
make custom types that look and act like builtins. Therefore, you can
make an array type that suits your needs and is almost exactly the same
syntax as a builtin array (except for some things reserved for
builtins, like literals). Such a thing is certainly possible, even
with using the GC for your allocation.
That parallels what game devs do in C++: They tend to use custom
allocators a lot, and they're likely to follow the same basic strategy
in D too, if/when it becomes a suitable replacement. I'm still just
browsing though, and I'm not all that familiar with D. If you can't
actually use the built-in dynamic arrays for this purpose, how difficult
would it be to reimplement a contiguously stored dynamic container using
custom allocation? I suppose you'd have to build it from the ground up
using a void pointer to a custom allocated block of memory, right? Do
user-defined types in D have any/many performance disadvantages compared
to built-ins?
No, you would most likely use templates, not void pointers. D's template
system is far advanced past C++, and I used it to implement my custom
allocators. It works great.
User-defined types are as high performance as builtins as long as the
compiler inlines properly.
BTW, I made the change to the runtime renaming the function previously
known as setCapacity to reserve. It won't be a property, even if that
bug is fixed.
-Steve
That's a bit of a downer, since a capacity property would have nice
symmetry with the length property. I suppose there were good reasons
though. Considering the name change, does that mean reserve can only
reserve new space, i.e. it can't free any that's already been allocated?
Capacity still exists as a read-only property. I did like the symmetry,
but the point was well taken that the act of setting the capacity was not
exact. It does mean that reserving space can only grow, not shrink. In
fact, the capacity property calls the same runtime function as reserve,
just passing 0 as the amount requested to get the currently reserved space.
You can't use capacity to free space because that could result in dangling
pointers. Freeing space is done through the delete keyword. We do not
want to make it easy to accidentally free space.
(That makes me wonder: Out of curiosity, how does the garbage
collector know how much space is allocated to a dynamic array or
especially to a void pointer? I suppose it's registered somewhere?)
The GC can figure out what page an interior pointer belongs to, and
therefore how much memory that block uses. There is a GC function to get
the block info of an interior pointer, which returns a struct that
contains the pointer to the block, the length of the block, and its flags
(whether it contains pointers or not). This function is what the array
append feature uses to determine how much capacity can be used. I believe
this lookup is logarithmic in complexity.
-Steve