On 6/7/17 3:56 AM, Biotronic wrote:
On Wednesday, 7 June 2017 at 05:43:06 UTC, ag0aep6g wrote:
[snip]

It seems to me this is a topic worthy of a more in-depth article. If
only I felt up to that. :p

Your understanding and explanation are excellent, actually!


When you create a slice 'a' in D (with the current GC and druntime, at
least), what happens behind the scenes is the allocator chops off a
block of N bytes, where N is some number larger than
a.length*typeof(a[0]).sizeof. For an array of two ints, N is 16.
For good measure, let's make a copy 'b' of that slice (it will come in
handy later):

int[] a = [1, 2];
int[] b = a;

import std.stdio;
writeln(a.capacity);
3
writeln(b.capacity);
3

The capacity is 3. Intriguing, as a block of 16 bytes should be able to
hold 4 ints.

We can ask the GC for more info on this block:

import core.memory;
auto info = GC.query(a.ptr);
writefln("0x%x, %s, %s ", info.base, info.size, info.attr);
0x2211010, 16, 10

That's the pointer to the start of the block, the size of the block, and
various attributes (appendable, e.g.).
We can get the raw data of the block:

auto block = (cast(ubyte*)info.base)[0..info.size];
writeln(block);
[1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8]

We can see our 1 and 2 in there, and a curious 8 at the end. That's the
currently used data, in bytes. That's also the reason the capacity is 3,
not 4 - this info has to live somewhere. If we were to append another
element, and print the data again:

a ~= 3;
writeln(block);
[1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 12]

See how the last byte changed to a 12? That just so happens to be
a.length*int.sizeof.

To be more specific: for blocks <= 256 bytes, 1 byte is reserved at the end of the block to store the length. For blocks > 256 bytes and <= 2048 bytes, 2 bytes are reserved at the end of the block to store the length of the array.

Larger blocks are PAGE size and up, and those have a special feature: they are not limited to a power of 2 in size, and can be extended literally in-place by tacking on additional PAGEs. I wanted to just put a size_t at the end of those as well, but the problem with that is that the end of the block (and with it the stored length) would move around as the block was extended or shrunk. Given how the runtime works, it's also possible that two threads could be appending to a shared array at the same time. So I decided to store the length at the beginning of the block instead.
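
Here is a rough sketch of how those size classes show up in practice (the inspect helper is mine, and the comments are assumptions based on the description above; exact numbers vary by druntime version):

import core.memory;
import std.stdio;

void inspect(T)(T[] arr, string name)
{
    auto info = GC.query(arr.ptr);
    writefln("%s: length=%s capacity=%s block size=%s",
             name, arr.length, arr.capacity, info.size);
}

void main()
{
    inspect(new int[](2), "small");    // block <= 256 bytes: 1 byte of length padding at the end
    inspect(new int[](200), "medium"); // block > 256 and <= 2048 bytes: 2 bytes of padding at the end
    inspect(new int[](2000), "large"); // PAGE size and up: length stored at the front of the block
}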

I would actually like to replace this mechanism with one that stores the length outside the block, in a separate memory space, as the current scheme is horrible for caches. Allocate a page worth of bytes, and you actually get 2 pages - size_t.sizeof.
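
A quick way to see that overhead in action (the exact figure is my assumption about the current allocator, not a guarantee):

import core.memory;
import std.stdio;

void main()
{
    auto arr = new ubyte[](4096);     // ask for exactly one page of data
    writeln(GC.query(arr.ptr).size);  // expected: 8192, since the hidden length pushes the block to a second page
}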

Note, for types with destructors, more bytes are reserved to store the type info of the array elements. I didn't write that part, so I'm not 100% sure how it works.
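
If you want to observe that from the outside, a hedged sketch (I'm not asserting how much is reserved or where, only that the capacities may differ):

import std.stdio;

struct NoDtor { int x; }
struct WithDtor { int x; ~this() {} }

void main()
{
    // Same element size, but the array whose elements have a destructor may
    // report a smaller capacity, since extra bytes in the block are reserved
    // for the element type info.
    writeln((new NoDtor[](2)).capacity);
    writeln((new WithDtor[](2)).capacity);
}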

Now for a curious thing: what happens to a's capacity when we append to b?

b.assumeSafeAppend();
b ~= 4;
writeln(a.capacity);
3

As above, the length of a in bytes equals the used bytes in the
allocated memory block, and so both a and b have capacity again.

Yes, for this reason, calling assumeSafeAppend is unsafe and can *never* be part of the normal treatment of arrays. It is on you, the programmer, to ensure that no references to the no-longer-valid portion of the array exist. The compiler can't ensure it, a library function can't ensure it, and such references are similar to dangling pointers.

Only a library type which encapsulates the array completely can use assumeSafeAppend.
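
For example, a type along these lines (a hypothetical sketch, not an existing library type) can use it safely, because no slice of its block ever escapes:

// Hypothetical buffer that owns its storage outright.
struct ScratchBuffer
{
    private int[] data;

    void put(int x) { data ~= x; }

    // Drop the contents but keep the allocated block for reuse.
    void clear()
    {
        data.length = 0;
        // Safe by construction: no slice of data ever escapes this struct,
        // so shrinking and re-appending cannot stomp anyone else's view.
        data.assumeSafeAppend();
    }

    // Hand out a copy, never the internal slice, to preserve that invariant.
    int[] contents() const { return data.dup; }
}

The key point is the invariant, not the type itself: as soon as any outside code can hold a slice of data, the assumeSafeAppend in clear() becomes exactly the hazard described above.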

For example, imagine these are immutable arrays: you have now overwritten immutable data!
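
To make that concrete, here is a hedged sketch of the immutable case (it assumes the appends stay within one block, as in the walkthrough above; a different runtime could behave differently):

import std.stdio;

void main()
{
    immutable(int)[] a;
    a ~= 1;                // appending to an immutable slice is fine: it only writes unused space
    a ~= 2;
    auto b = a;            // b shares a's block
    a ~= 3;                // fits in the spare capacity, so a and b still share the block
    b.assumeSafeAppend();  // claims nothing references the block past b's end (but a does)
    b ~= 4;                // appends in place, stomping the immutable 3
    writeln(a);            // likely prints [1, 2, 4]: immutable data has been changed
}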

-Steve
