Re: [bitc-dev] Non copying Rc-Immix was Putting the stack on an RC-Immix heap

Jonathan S. Shapiro Mon, 18 Nov 2013 07:42:14 -0800

On Mon, Nov 18, 2013 at 5:57 AM, Ben Kloosterman <[email protected]> wrote:

> On Mon, Nov 18, 2013 at 12:37 AM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> I just realized that the immix papers don't mention one of the big
>> advantages to bump allocations: when you are allocating multiple objects in
>> sequence, the limit checks can be consolidated into a single check. It
>> seems to me that this is not true for immix, because the occupancy count on
>> each line makes the consolidation difficult. If things can't be
>> consolidated, that's a pretty significant fast-path overhead. Does anybody
>> know if they are consolidating successfully?
>>
>
> I think there is a runtime constraint where the runtime presents one call to
> alloc at a time.
>

Right. But if you know you are going to allocate an 8-word object followed
by a 12-word object, you can just do a single 24-word (allowing for
headers) allocation. Given the inline sequence you sent out, a conventional
compiler would do this optimization more or less automatically, except that
the procedure calls to the long path are an impediment.

> I see no reason why, if the runtime/compiler supports it, you can't do:
>
> var ptr = tryallocatecontiguous(size_of_multiple_objects);
> if (ptr == NULL)
>      // allocate each object individually
> // construct the objects
>
> Though it does add to the inline path.
>

You wouldn't do it conditionally. You'd just do it. You'd probably end up
allocating more chunks out of the "free" block, but that's not necessarily
bad.
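
To make the consolidation concrete, here is a minimal sketch in C. All of the names (BumpRegion, bump_alloc, alloc_pair, alloc_slow) are hypothetical and are not taken from the RC-Immix code, and the 2-word header is only my assumption so that 8 + 12 words comes out to 24. The point is just that the pair costs one limit check instead of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names; not the RC-Immix or BitC allocator API. */
typedef struct {
    uintptr_t *cursor;  /* next free word in the current run of free lines */
    uintptr_t *limit;   /* one past the last free word in that run */
} BumpRegion;

enum { HEADER_WORDS = 2 };  /* assumed per-object header size, in words */

/* Stand-in for the out-of-line long path. A real allocator would look for
 * another free run or a fresh block here; this sketch just reports failure. */
static uintptr_t *alloc_slow(BumpRegion *r, size_t words) {
    (void)r;
    (void)words;
    return NULL;
}

/* Ordinary bump allocation: exactly one limit check per call. */
static uintptr_t *bump_alloc(BumpRegion *r, size_t words) {
    assert(r->cursor <= r->limit);
    if ((size_t)(r->limit - r->cursor) < words)
        return alloc_slow(r, words);
    uintptr_t *p = r->cursor;
    r->cursor += words;
    return p;
}

/* Allocate an 8-word object and a 12-word object with a single check by
 * requesting the combined size and carving it up afterwards.
 * Returns 1 on success, 0 if the long path could not satisfy the request. */
static int alloc_pair(BumpRegion *r, uintptr_t **a, uintptr_t **b) {
    const size_t a_words = HEADER_WORDS + 8;
    const size_t b_words = HEADER_WORDS + 12;
    uintptr_t *base = bump_alloc(r, a_words + b_words);  /* 24 words total */
    if (base == NULL)
        return 0;
    *a = base;
    *b = base + a_words;
    return 1;
}
```

Note there is no "try, then fall back to individual allocations" branch in the inline path. If the combined request doesn't fit in the current run, the long path handles the combined request.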

> I think (I checked, but I'm not 100% sure) that the occupancy count is done
> during the collect phase (when processing newroots).
>

(slaps head) yes, of course. And that would make all the difference. If the
object counts are handled during nursery collection, dynamically sized
blocks aren't a problem at all.

> Note the metadata for lines is located at the start of the chunk, not the
> block (I think all metadata except the object and object GC header is in the
> chunk, and hence a block is just unmarked objects).
>

That's strange, if only because it contradicts the paper.

>> Regarding the stack, there is another issue that I hadn't considered: if
>> we use an immix block of some form for the stack, we can't rely on the
>> collector to zero it in the background. We would have needed to zero
>> reference slots anyway, so it's not like this is a new cost, but the
>> stack-of-objects approach may change how eagerly we need to do that.
>> Actually, I don't think it does, but it needs looking at.
>>
>
> I take it we can't rely on the collector to do this because collector
> threads have an immix block for a stack.
>

No no. We'll get an initially zero block just fine. The problem is that the
stack pointer is going up and down within the block. As procedures return,
they leave dirty lines "below" the stack. This happens *way* too rapidly
for background zeroing to work.

Maybe I haven't been clear enough. I'm not imagining that we allocate a
frame as a nursery object. I'm imagining that we manage a *conventional* stack
in an immix block, using the bump limit pointer to guard the end of the
block so we know when to create a new stack block/segment. All I was doing
was laying out the frames in a way that made the stack walkable by the
conventional object mark/scan logic.
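
Roughly what I have in mind, as a sketch in C. The names (StackSegment, FrameLayout, push_frame, pop_frame) are made up for illustration, the 4096-word block size is an assumption, and the stack grows upward here only to keep the arithmetic simple. The sketch shows the limit check that tells us when a new segment is needed, and why the reference slots have to be zeroed on entry: popping a frame leaves its words dirty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_WORDS = 4096 };  /* assumed block size, in words */

/* Hypothetical stack segment living in one block. */
typedef struct {
    uintptr_t words[BLOCK_WORDS];
    size_t sp;  /* index of the next free word */
} StackSegment;

/* Hypothetical frame description: reference slots come first so that a
 * collector could scan them the way it scans an object's fields. */
typedef struct {
    size_t total_words;  /* whole frame */
    size_t ref_words;    /* leading slots that hold references */
} FrameLayout;

/* Push a frame. Returns the frame base, or NULL when the frame does not fit
 * and the caller must switch to a new stack segment. */
static uintptr_t *push_frame(StackSegment *s, const FrameLayout *f) {
    assert(f->ref_words <= f->total_words);
    if (BLOCK_WORDS - s->sp < f->total_words)
        return NULL;  /* limit check guards the end of the block */
    uintptr_t *frame = &s->words[s->sp];
    /* Earlier frames may have left stale values here, so the reference
     * slots are cleared eagerly; nothing zeroes them in the background. */
    memset(frame, 0, f->ref_words * sizeof *frame);
    s->sp += f->total_words;
    return frame;
}

/* Pop a frame. The vacated words keep whatever was last written to them. */
static void pop_frame(StackSegment *s, const FrameLayout *f) {
    assert(s->sp >= f->total_words);
    s->sp -= f->total_words;
}
```

Only the reference slots are cleared. The non-reference slots can stay dirty because the collector never interprets them.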

> One huge issue is block availability...
>

Right. Except now that I've said more clearly that I'm not allocating stack
frames as heap objects, hopefully it's clearer why I don't think that's an
issue.

> The normal RC-Immix cycle is: allocate all new blocks, allocate all recycled
> blocks, run collect. We probably don't want a stack to use recycled blocks,
> so we need some sort of reserve here.
>

For what I have in mind, the stack *definitely* can't use recycled blocks.

> Writing a good self-scaling hash table is not easy.
>

True. But in this case, doubling the hash table size each time is a good
and simple approach.
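
For illustration, a minimal sketch of the doubling approach in C. It is a generic open-addressing set of word-sized keys, not the actual table under discussion; the names (Table, table_insert, and so on), the placeholder hash, and the "grow at half full" threshold are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical open-addressing set of non-zero keys (e.g. addresses). */
typedef struct {
    uintptr_t *slots;  /* 0 marks an empty slot, so key 0 cannot be stored */
    size_t cap;        /* always a power of two */
    size_t count;      /* number of stored keys */
} Table;

/* Placeholder multiplicative hash; any reasonable hash would do. */
static size_t slot_for(uintptr_t key, size_t cap) {
    return (size_t)((key * (uintptr_t)2654435761u) >> 4) & (cap - 1);
}

/* cap must be a power of two. Returns 1 on success, 0 on allocation failure. */
static int table_init(Table *t, size_t cap) {
    t->slots = calloc(cap, sizeof *t->slots);
    t->cap = cap;
    t->count = 0;
    return t->slots != NULL;
}

/* Linear probing. Returns 1 if the key was newly stored, 0 if already there. */
static size_t place(uintptr_t *slots, size_t cap, uintptr_t key) {
    size_t i = slot_for(key, cap);
    while (slots[i] != 0 && slots[i] != key)
        i = (i + 1) & (cap - 1);
    if (slots[i] == key)
        return 0;
    slots[i] = key;
    return 1;
}

/* Double the capacity and rehash every stored key into the new array. */
static int table_grow(Table *t) {
    size_t new_cap = t->cap * 2;
    uintptr_t *fresh = calloc(new_cap, sizeof *fresh);
    if (fresh == NULL)
        return 0;
    for (size_t i = 0; i < t->cap; i++)
        if (t->slots[i] != 0)
            place(fresh, new_cap, t->slots[i]);
    free(t->slots);
    t->slots = fresh;
    t->cap = new_cap;
    return 1;
}

/* Insert, doubling first if the table would become more than half full. */
static int table_insert(Table *t, uintptr_t key) {
    assert(key != 0);
    if ((t->count + 1) * 2 > t->cap && !table_grow(t))
        return 0;
    t->count += place(t->slots, t->cap, key);
    return 1;
}

static int table_contains(const Table *t, uintptr_t key) {
    size_t i = slot_for(key, t->cap);
    while (t->slots[i] != 0) {
        if (t->slots[i] == key)
            return 1;
        i = (i + 1) & (t->cap - 1);
    }
    return 0;
}
```

Because the table never exceeds half full, every probe sequence reaches an empty slot, and the rehashing cost of doubling is amortized over the inserts that preceded it.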

> I kind of like the idea of writing a reference to static metadata
> regardless of the method, since:
> - You want the writing to be as fast as possible, and the parsing is less
> important.
> - It has constant time regardless of the number of variables.
>

Sure, but the stack map is effectively the same. It's using the same actual
map data, but it's statically precomputed and doesn't require any writes at
run time at all.
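
A sketch of what I mean by a stack map, in C. Everything here is illustrative: the entry layout, the names (StackMapEntry, find_map, scan_frame, count_ref), the 32-slot bitmap, and the table contents are assumptions, not the format any particular compiler emits. The table is constant data keyed by return address, so the running program never writes anything; only the collector reads it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack map entry, one per call site. */
typedef struct {
    uint32_t pc_offset;   /* return address, as an offset from code start */
    uint32_t ref_bitmap;  /* bit i set => frame slot i holds a reference */
} StackMapEntry;

/* Would be emitted by the compiler, sorted by pc_offset.
 * These values are invented for the example. */
static const StackMapEntry stack_maps[] = {
    { 0x0040, 0x00000005 },  /* slots 0 and 2 are references */
    { 0x0098, 0x00000002 },  /* slot 1 is a reference */
    { 0x0120, 0x00000000 },  /* no references live at this call site */
};

/* Binary search by return address. Returns NULL if there is no entry. */
static const StackMapEntry *find_map(uint32_t pc_offset) {
    size_t lo = 0, hi = sizeof stack_maps / sizeof stack_maps[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (stack_maps[mid].pc_offset == pc_offset)
            return &stack_maps[mid];
        if (stack_maps[mid].pc_offset < pc_offset)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

typedef void (*RefVisitor)(uintptr_t *slot, void *ctx);

/* Visit every reference slot of one frame, as named by its stack map. */
static void scan_frame(uintptr_t *frame, uint32_t pc_offset,
                       RefVisitor visit, void *ctx) {
    assert(visit != NULL);
    const StackMapEntry *m = find_map(pc_offset);
    if (m == NULL)
        return;
    for (unsigned i = 0; i < 32; i++)
        if (m->ref_bitmap & (UINT32_C(1) << i))
            visit(&frame[i], ctx);
}

/* Example visitor: counts the reference slots it is shown. */
static void count_ref(uintptr_t *slot, void *ctx) {
    (void)slot;
    *(size_t *)ctx += 1;
}
```

The lookup cost lands entirely on the collector, which is the side where we care less about speed.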

shap

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
