On Thu, Oct 17, 2013 at 10:06 AM, Bennie Kloosteman <[email protected]> wrote:

> Re S bits: looks solid. The only issues (and neither should be one): are we
> sure only the last block in a line straddles? (Seems likely, but you never
> know.)
>

Yes. Lines are empty when allocated, and are filled by increasing
addresses. The last object (not block) in a line can overflow into the next
line. Once it does, the initial line is full and is marked S to indicate
overflow (S for "spill"). We don't need to track this in both directions,
because the marking algorithm is working from the object address rather
than the line address.
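
A minimal sketch of the allocation-side marking described above, written as a small Python simulation rather than real allocator code. The names `Block`, `alloc`, `LINE_SIZE`, and `LINES_PER_BLOCK` are hypothetical, the sizes are arbitrary, and the sketch assumes no object is larger than a line, so a spill reaches at most the next line:

```python
LINE_SIZE = 128          # bytes per line (assumed; the thread does not fix a value)
LINES_PER_BLOCK = 8      # kept tiny so the example is easy to follow


class Block:
    """One block of lines with a bump cursor and a per-line S ("spill") bit."""

    def __init__(self):
        self.cursor = 0                            # next free byte offset
        self.limit = LINE_SIZE * LINES_PER_BLOCK   # end of the block
        self.spill = [False] * LINES_PER_BLOCK     # S bit for each line

    def alloc(self, size):
        """Bump-allocate `size` bytes; return the start offset, or None if full."""
        # Assumption: objects are no larger than a line, so an object can
        # spill into at most the one following line.
        assert 0 < size <= LINE_SIZE
        start = self.cursor
        end = start + size
        if end > self.limit:                       # the "does it fit" check
            return None
        first_line = start // LINE_SIZE
        last_line = (end - 1) // LINE_SIZE
        if last_line != first_line:
            # The object straddles the boundary: the line it starts in is
            # now full, so record the overflow there. Nothing is recorded
            # on the following line.
            self.spill[first_line] = True
        self.cursor = end
        return start


block = Block()
a = block.alloc(100)   # fits entirely inside line 0
b = block.alloc(100)   # starts in line 0, ends in line 1: line 0 gets its S bit
```

Only the line the object starts in is touched, which matches the point that the overflow does not need to be tracked in both directions.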


> There may also be a higher cost for getting the line data...
>

Yes. But the S bit is never consulted by the mutator. It is set at
allocation time by the bump allocator, and it is consumed by the background
collector while discovering and zeroing blocks of contiguous free lines.

The allocator has to run a "does it fit" check on the line in any case. The
hidden cost here is the cache line reference to update the per-line S bit.
But in a nursery allocator that line will be cache hot.
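
For the collector side, here is a hedged sketch of how the S bit might be consumed when looking for runs of contiguous free lines. The function name `free_line_runs` and the per-line `live` list are hypothetical, and the rule used (a line counts as occupied if the previous line is live and has S set) is a conservative assumption, not something the paper or this thread spells out:

```python
def free_line_runs(live, spill):
    """Return (first_line, count) for each maximal run of contiguous free lines.

    live[i]  -- True if marking found a live object starting in line i
    spill[i] -- the S bit: the last object in line i overflows into line i + 1
    """
    runs = []
    run_start = None
    for i in range(len(live)):
        # Conservative assumption: if the previous line is live and spilled,
        # treat this line as occupied, without checking whether the spilling
        # object itself is the live one.
        spilled_into = i > 0 and live[i - 1] and spill[i - 1]
        in_use = live[i] or spilled_into
        if in_use:
            if run_start is not None:
                runs.append((run_start, i - run_start))
                run_start = None
        elif run_start is None:
            run_start = i
    if run_start is not None:
        runs.append((run_start, len(live) - run_start))
    return runs


live = [True, False, False, False, True, False]
spill = [True, False, False, False, False, False]
runs = free_line_runs(live, spill)   # [(2, 2), (5, 1)]
```

Line 1 has no object starting in it, but it is excluded from the free run because line 0 is live and spilled into it. A stale S bit on a dead line is ignored, since the object that overflowed is dead too.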


> Note that an object with an N bit set resides in thread-local storage; no
>> cross-CPU contention exists for such objects.
>>
>
> ???? So what happens when a reference to that object is passed to another
> thread.
>

It can't be. Objects in the nursery, by design and requirement, are thread
local until they are "graduated" into the general heap. So the answer to
your question is either that a collection is triggered or that the object
is initially allocated in the general heap.

You definitely don't want to forward these objects, because you'd have to
do a page invalidate on the nursery region until the forwarding completed.
*Definitely* not a good idea!

This isn't that unusual - objects allocated from the large object store
(LOS) are also allocated from the general heap. The reason general heap
allocation is avoided is to reduce contention and reference counting. That
doesn't have to be 100% effective to make a huge difference.


>
>
>> *New Object Allocation*
>>
>>    - *Question:* Does nursery allocation always proceed from an
>>    initially free block?
>>
> It doesn't say so, so it would follow normal allocation: a free block and
> a recycled block.
>

That's one possibility, but it would make the new object bump allocator
more complex, which seems unfortunate. I tend to suspect that their
strategy for nursery collection has room for improvement.

I think that a better question (on my part) would have been: does tracing
the nursery always cause the surviving objects in the nursery to be copied
into the general heap?

My *guess* is that surviving objects are unconditionally moved to the
general heap. If so, *some* of those objects will be very young, which will
lead to heap churn. If we believe the 10% survival heuristic, it might be
worth looking at hybrid collection strategies in the nursery. So here's the
thing:

   - The most recent (youngest) 10% of objects in the nursery are
   contiguous, at the top of the allocated region.
   - Run mark-copy to evacuate everything *below* that to the general heap.
   - Now either run mark-compact on the 10% or just implement a cyclic
   nursery bump allocator.

The point is to give those infant objects time to die young. Note that we
don't even need to mark-compact them. They're only taking up 10% of the
nursery space in any case, and we can just ignore their internal
fragmentation.
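
The partitioning step of the scheme above can be sketched as follows. This is only an illustration under stated assumptions: `plan_nursery_collection` is a hypothetical name, the nursery base is taken to be address 0, "youngest 10%" is measured in allocated bytes rather than object count, and fixing up references between retained and evacuated objects is left out entirely:

```python
def plan_nursery_collection(objects, top, keep_percent=10):
    """Split nursery objects into those to evacuate and those left in place.

    objects -- list of (address, size, is_live) tuples, in any order
    top     -- bump cursor: one past the last allocated byte (nursery base is 0)
    """
    # Everything at or above `cut` belongs to the youngest keep_percent of
    # the allocated bytes, because the nursery fills by increasing address.
    cut = top - (top * keep_percent) // 100
    evacuate = []   # live objects below the cut: copied to the general heap
    retained = []   # infants at or above the cut: left alone, live or not
    for address, size, is_live in objects:
        if address >= cut:
            retained.append(address)
        elif is_live:
            evacuate.append(address)
        # Dead objects below the cut are simply reclaimed.
    return cut, evacuate, retained


objects = [
    (0, 400, True),
    (400, 300, False),
    (700, 150, True),
    (850, 70, True),
    (920, 30, True),
    (950, 50, False),
]
cut, evacuate, retained = plan_nursery_collection(objects, top=1000)
# cut == 900, evacuate == [0, 700, 850], retained == [920, 950]
```

An object that starts below the cut but extends past it is treated as old and evacuated. Whether the retained region is then compacted or simply left in place for a cyclic bump allocator is a separate choice that this sketch does not make.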

The counter-argument would be: you'll catch those objects when you run the
decrements on the non-live reference from the nursery to the general heap,
and you'll probably end up returning most of those lines to their
containing block at that point.

Except that if you implement a deferred dec-buf....

This is something that can probably be determined only by experiment.


>
>
>>
>>    - *Question*: Why isn't partial compaction being used here? I.e. a
>>    scheme in which up to *N* lines will be moved to the front of the
>>    nursery block, with the effect that the oldest lines trend toward the
>>    bottom and can be generationally eliminated? Or is this really what is
>>    meant by "compaction"?
>>
>>
> Sounds good, and there may be an attractive scheme without any stack map /
> relocation, though obviously this will waste more memory, but they do
> recycle lines in partial blocks.
>

If stuff is getting copied to the general heap, you need a stack map in any
case. You said this later.


> I'm not sure why they didn't check a more standard nursery as discussed
> previously; they do note the better performance of the generational
> collector on smaller heaps (Stephen Blackburn was on both, so he should
> know). Probably added implementation cost, and they were already good
> enough.
>

Graduate students working on big things are often encouraged to stop when
they have enough for a dissertation or a paper. Then they write something
in the paper like "well, we wanted to see how the thing we were focusing on
did without polluting the experiment with other techniques". Sound like
anything you may have seen in the RC-immix paper?

From a research perspective, I actually think this was a very reasonable
"line in the sand" for this particular work. That doesn't mean that *we* need
to be constrained by it.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
