Re: GC.sizeOf(array.ptr)

Dicebot via Digitalmars-d Tue, 30 Sep 2014 09:05:59 -0700

On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven Schveighoffer wrote:

On 9/30/14 10:24 AM, Dicebot wrote:
On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven Schveighoffer wrote:
Assertion passes with the D1/Tango runtime but fails with the current D2 runtime. This happens because `result.ptr` is not actually a pointer returned by gc_qalloc from array reallocation, but an interior pointer 16 bytes from the start of that block. Druntime stores some metadata (length/capacity, I presume) at the very beginning.
This is accurate; it stores the "used" size of the array. But it's only the case for arrays, not general GC.malloc blocks.
An alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate). But it doesn't cover the same inputs.
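For readers who want to see the interior pointer directly, here is a minimal sketch using only `GC.addrOf`, `GC.sizeOf` and the `capacity` property. The helper name `offsetIntoBlock` is made up for this illustration, and the numbers printed depend on the runtime version, so none are shown here.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: distance of p from the base of its GC block.
size_t offsetIntoBlock(const(void)* p)
{
    // GC.addrOf returns the base address of the block that contains p.
    return cast(size_t) p - cast(size_t) GC.addrOf(p);
}

void main()
{
    // Two pages of data, so the allocation is at least page-sized.
    auto arr = new ubyte[](8192);

    writefln("offset of arr.ptr into its block: %s", offsetIntoBlock(arr.ptr));
    // GC.sizeOf is documented to return 0 for interior pointers.
    writefln("GC.sizeOf(arr.ptr):               %s", GC.sizeOf(arr.ptr));
    writefln("GC.sizeOf(GC.addrOf(arr.ptr)):    %s", GC.sizeOf(GC.addrOf(arr.ptr)));
    // capacity is measured in elements and accounts for the runtime's metadata.
    writefln("arr.capacity:                     %s", arr.capacity);
}
```

Passing `GC.addrOf(arr.ptr)` instead of `arr.ptr` is one way to query the block size without caring whether the pointer is interior.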
Why is it stored at the beginning and not at the end of the block (like capacity)? I'd like to explore options for removing the interior pointer completely before proceeding with adding more special cases to GC functions.
First, it is the capacity. It's just that the capacity lives at the beginning of larger blocks.
The reason is due to the ability to extend pages.
With smaller blocks (2048 bytes or less), the page is divided into equal portions, and those can NEVER be extended. Any attempt to extend results in a realloc into another block. Putting the capacity at the end makes sense for 2 reasons: 1. 1 byte is already reserved to prevent cross-block pointers, 2. It doesn't cause alignment issues. We can't very well offset a 16 byte block by 16 bytes. But importantly, the capacity field does not move.
However, for page and above size (4096+ bytes), the original (D1 and early D2) runtime would attempt to extend into the next page, without moving the data. Thus we save the copy of data into a new block, and just set some bits and we're done.
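The two layouts described above can be modelled roughly as follows. This is only a sketch built from the figures quoted in this thread, not druntime's actual code: the names `pageSize`, `largePrefix`, `arrayDataOffset` and `usedFieldOffset` are invented here, and the width of the field is left as a parameter because the thread does not state it for every block size.

```d
import std.stdio : writefln;

// Assumed constants, taken from the numbers mentioned in the thread.
enum size_t pageSize = 4096;   // "page and above size (4096+ bytes)"
enum size_t largePrefix = 16;  // data offset used for large blocks

/// Offset of the first array element from the block base.
size_t arrayDataOffset(size_t blockSize)
{
    return blockSize >= pageSize ? largePrefix : 0;
}

/// Offset of the "used size" field from the block base.
/// Large blocks keep it at the start; smaller blocks keep it at the end.
size_t usedFieldOffset(size_t blockSize, size_t fieldSize)
{
    return blockSize >= pageSize ? 0 : blockSize - fieldSize;
}

void main()
{
    size_t[] sizes = [16, 2048, 4096, 8192];
    foreach (size; sizes)
        // A 1-byte field is assumed here purely for the printout.
        writefln("block %5s: data at +%s, used-size field at +%s",
                 size, arrayDataOffset(size), usedFieldOffset(size, 1));
}
```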

Ah, that must be what confused me - I looked at the small block offset calculation originally and blindly assumed the same logic for other sizes. Sorry, my fault!

But this poses a problem for when the capacity field is stored at the end -- especially since we are caching the block info. The block info can change with a call to GC.extend (whereas for a fixed-size block, the block info CANNOT change). Depending on what "version" of the block info you have, the "end" can be different, and you may end up corrupting data. This is especially important for shared or immutable array blocks, where multiple threads could be appending at the same time.
So I made the call to put it at the beginning of the block, which obviously doesn't change, and offset everything by 16 bytes to maintain alignment.
It may very well be that we can put it at the end of the block instead, and you can probably do so without much effort in the runtime (everything uses CTFE functions to calculate padding and location of the capacity). It has been such a long time since I did that, I'm not very sure of all the reasons not to do it. A look through the mailing list archives might be useful.
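The stale-cache hazard can be illustrated with a toy model. `BlockInfo`, `fieldAtEnd` and `fieldAtStart` are hypothetical stand-ins written for this sketch (the struct is deliberately not core.memory's `BlkInfo`), and the addresses are made up.

```d
import std.stdio : writefln;

/// Hypothetical stand-in for cached block info: base address and block size.
struct BlockInfo
{
    size_t base;
    size_t size;
}

/// Where the field would live if it were stored at the end of the block.
size_t fieldAtEnd(BlockInfo info)
{
    return info.base + info.size - size_t.sizeof;
}

/// Where the field lives when it is stored at the start of the block.
size_t fieldAtStart(BlockInfo info)
{
    return info.base;
}

void main()
{
    auto cached  = BlockInfo(0x1000, 4096); // snapshot taken before an extend
    auto current = BlockInfo(0x1000, 8192); // same block after growing in place

    // An end-relative location depends on which snapshot you hold.
    assert(fieldAtEnd(cached) != fieldAtEnd(current));
    // A start-relative location is the same for both snapshots.
    assert(fieldAtStart(cached) == fieldAtStart(current));

    writefln("end-based:   stale=%#x fresh=%#x", fieldAtEnd(cached), fieldAtEnd(current));
    writefln("start-based: stale=%#x fresh=%#x", fieldAtStart(cached), fieldAtStart(current));
}
```

In this model a writer holding the stale snapshot would store the field in the middle of the grown block, which is the kind of corruption described above.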

I think it should be possible. That way the actual block size will simply be considered a bit smaller and extending will happen before the reserved space is hit. But of course I have only a very vague knowledge of druntime acquired while porting cdgc, so I may need to think about it a bit more and probably chat with Leandro too :)

Have created a bugzilla issue for now: https://issues.dlang.org/show_bug.cgi?id=13558
