On Sat, Nov 9, 2013 at 3:37 AM, Ben Kloosterman <[email protected]> wrote:

>> Second, there is nothing magical about a particular fixed-size immix
>> block. We can generalize the notion to any 2^k byte block above some
>> minimum size. The only reason we want to know the size statically is that
>> it reduces the number of instructions in the bump allocator fast path.
>>
>
> At first I thought that with 2^k you would suffer buddy-equivalent
> fragmentation, but this is not the case since you can request the hole
> space. A bit cumbersome, though, so you would try to keep sizes common.
>

Initially I had a similar concern, but I don't think it's the case. This is
why the VA range allocator matters.

The missing piece in the RC-immix paper is that the storage has to *come* from
somewhere. The stacks, the immix blocks, the large object pages, and the
nursery (if that isn't just immix blocks) all get allocated as contiguous
ranges of chunks whose size is 2^k >= pagesize and which are 2^k aligned,
up to some maximum k. It isn't quite as bad as a buddy allocator, because
only a few sizes are in use in practice.
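
To make the size and alignment rule concrete, here is a minimal sketch in C. The helper names (round_up_pow2, align_up, chunk_size_for) are hypothetical and only illustrate the arithmetic; they are not taken from any existing allocator, and the page size is passed in as a plain parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next power of two. Assumes 1 <= n <= SIZE_MAX/2 + 1,
 * so the shift below cannot overflow. */
static size_t round_up_pow2(size_t n) {
    assert(n >= 1 && n <= SIZE_MAX / 2 + 1);
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Round addr up to a multiple of align, where align is a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/* Chunk size for a request: a power of two that is at least one page.
 * page_size is assumed to be a power of two. */
static size_t chunk_size_for(size_t request, size_t page_size) {
    size_t sz = round_up_pow2(request);
    return sz < page_size ? page_size : sz;
}
```

A chunk of size 2^k would then be placed at align_up(candidate, 2^k) inside whatever VA range is free.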

In contrast to a buddy allocator, however, this space can mostly be
defragmented. The stacks *may* (not always) be non-relocatable because of C
frames, but all of the other blocks can be evacuated over time. And we need
to do that anyway to avoid fragmentation of the *virtual* address space in
the LOS. This layer is also the layer at which real pages get returned to
the underlying operating system.

> The bump allocator is not a big deal because you're counting lines; the only
> difference is the # of lines, but you only hit that at the end of each
> line...
>

Yes, though the bump allocator does need to know the block size so that it
can find the metadata bytes in order to update the counts.
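
A rough sketch of why a static block size helps, assuming 32 KiB blocks, 128-byte lines, and one metadata byte per line stored at the start of the block. The layout and all names here are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry: 2^15-byte blocks, 2^7-byte lines. */
enum { BLOCK_SHIFT = 15, LINE_SHIFT = 7 };
#define BLOCK_SIZE ((uintptr_t)1 << BLOCK_SHIFT)

/* Blocks are 2^k aligned, so masking the low bits yields the block base. */
static uintptr_t block_base(uintptr_t p) {
    return p & ~(BLOCK_SIZE - 1);
}

/* Index of the line within its block that contains address p. */
static size_t line_index(uintptr_t p) {
    return (size_t)((p & (BLOCK_SIZE - 1)) >> LINE_SHIFT);
}

/* Address of the per-line metadata byte, under the assumed layout in
 * which metadata byte i for line i sits at block_base + i. */
static uintptr_t line_meta_addr(uintptr_t p) {
    return block_base(p) + line_index(p);
}
```

With BLOCK_SHIFT known at compile time, both masks fold into immediates; if the block size varied per block, the allocator would have to load it from somewhere first.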


> ..(And you need lines to find objects quickly.) Another option: when
> you load the block, you load an allocator set to that block size (if
> holding a register or variable for size is too big a cost).
>

I suspect that's too much state for the inline portion of the bump
allocator to use effectively.

>> So how does this relate to the stack? You wrote:
>>
>>
>>>
>>> GC header
>>> vtable ptr
>>
>>
> Why do we need a GC header, when every algo. works without a GC header on
> the stack?
>

There are definitely other ways you can do this. You don't *need* a GC
header. And it might be better to use a PC-indexed stack map, which is the
usual implementation. Mostly, it seemed to me that it would be nice to have
some economy of mechanism here. Because its entries are ordered by PC rather
than by SP, a stack map is not a fast thing to search.
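
For reference, a PC-indexed lookup might look roughly like the sketch below. The entry layout, the names, and the demo table are all hypothetical; real stack maps carry more information. The point is only that each frame visited costs a search keyed on its return PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack map entry: one per call site. */
typedef struct {
    uintptr_t return_pc;  /* address following the call instruction */
    uint32_t  slot_mask;  /* bit i set => frame slot i holds a reference */
} StackMapEntry;

/* Made-up table, sorted by return_pc in ascending order. */
static const StackMapEntry demo_map[] = {
    { 0x1000, 0x1 },
    { 0x1040, 0x6 },
    { 0x2000, 0x0 },
};

/* Binary search for the entry whose return_pc equals pc.
 * Returns NULL when pc is not a recorded call site. */
static const StackMapEntry *stack_map_lookup(const StackMapEntry *table,
                                             size_t n, uintptr_t pc) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].return_pc < pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && table[lo].return_pc == pc) ? &table[lo] : NULL;
}
```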

The question, I think, is whether it's cheaper to push a word on every call
or cheaper to use a stack map. That word, by the way, could also encode
which call-clobbered registers contained objects at the point of call.
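
One possible encoding of such a word, purely as a sketch: the low 8 bits hold a mask of call-clobbered registers that contained object references at the call, and the remaining bits identify a frame descriptor. The field widths and names are assumptions, not a proposed ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split: low 8 bits = register mask, upper bits = descriptor id. */
enum { REG_MASK_BITS = 8 };
#define REG_MASK_FIELD ((UINT64_C(1) << REG_MASK_BITS) - 1)

/* Build the word the caller would push before a call. */
static uint64_t make_frame_word(uint64_t descriptor_id, unsigned reg_mask) {
    return (descriptor_id << REG_MASK_BITS) | (reg_mask & REG_MASK_FIELD);
}

/* Recover the frame descriptor id. */
static uint64_t frame_descriptor(uint64_t word) {
    return word >> REG_MASK_BITS;
}

/* Nonzero if call-clobbered register number reg held an object reference. */
static int reg_holds_object(uint64_t word, unsigned reg) {
    assert(reg < REG_MASK_BITS);
    return (int)((word >> reg) & 1);
}
```

The marker then reads this word directly from the frame instead of searching a table keyed by return PC.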


> Speaking of which, I think the vtable pointer can be a 32-bit pointer; 1G is
> a LOT of type data.
>

Possibly. And yes, this would usefully reduce the object header size.
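
A minimal sketch of one way to do it, assuming all vtables live in a single region that starts at a known base address, so the header stores a 32-bit byte offset rather than a full pointer. The base and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compress a vtable address to a 32-bit offset from the region base.
 * Assumes the vtable lies within 4 GiB above base. */
static uint32_t compress_vtable(uint64_t base, uint64_t vtable_addr) {
    assert(vtable_addr >= base && vtable_addr - base <= UINT32_MAX);
    return (uint32_t)(vtable_addr - base);
}

/* Expand a stored 32-bit offset back into a full address. */
static uint64_t expand_vtable(uint64_t base, uint32_t offset) {
    return base + offset;
}
```

The cost is one add on every vtable load, and the base has to be a constant or live in a register.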


>> But mainly, what I was trying to say was that stacks are allocated from
>> blocks, and there are ways we can exploit the immix-style block
>> organization that are useful for stack marking.
>>
>>
> Yes, it's a convenient organization for anything that wants to scan for
> objects.
>

And it can be made to work on the C heap as well, with care.

> Yes, I think it's better, provided the header and line/metadata cost is not
> high. For a tight loop of calls you don't want to put down 2 words each
> time. Some things we can do besides reducing the header:
> - In a loop making the same calls we can re-use the same stack frame (in a
> similar manner to callee slots, if that makes sense). E.g. you can
> simply check whether the vtable pointer is the same, in which case the stack
> frame is the same and we can just proceed to putting the variables in the
> slots.
>

That can usually be statically determined.


> - If there are no references then there is no header. Such stack frames
> could not be easily "found", but they should not need to be, and would incur
> zero cost.
>

That's harder, because it makes the calling convention non-uniform. You're
basically asking for an effect type on every procedure of the form "does
something with references". If you are generating this code from a JIT
that's easy enough to do in most cases, but I wouldn't want to do it from a
static compiler.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
