On Wed, Aug 7, 2013 at 10:08 AM, Bennie Kloosteman <[email protected]> wrote:

>
> On Wed, Aug 7, 2013 at 10:40 PM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> On Tue, Aug 6, 2013 at 8:50 PM, Bennie Kloosteman <[email protected]> wrote:
>>
>>> Why a sub heap and not a separate heap? Or is it because it's still
>>> under the GC's nominal control?
>>>
>>
>> At the time I wrote that, I was considering this as a problem of how to
>> subset the GC heap. In that frame, thinking of it as a sub-heap seemed to
>> make sense.
>>
>> Ultimately, there are no separate heaps. At a certain level, all objects
>> live in some portion of the heap and need sufficiently compatible semantics
>> for a single set of pointer types to make sense. The minute you can have a
>> pointer from one "separate heap" to another you have a single heap again.
>>
>
> Q. If region analysis shows no ref, you can put it in a sub heap. What
> happens when the region scope finishes and you can set such pointers?
> Are you envisaging that the sub heap just automatically becomes part of
> the whole GC, or do you move / copy such objects?
>

I'm afraid that I'm not making sense of the question. How would region
analysis "show no ref"? And that's not the criterion for a sub-heap in any
case.

Let me try again to say what I mean by a sub-heap.

We have a directed graph of objects. Most of its references are intra-graph
references. The in-degree of this graph is small, meaning that there are
only a few pointers out there that point to something inside the graph.
When those few pointers go away, the entire graph is collectable. While
they continue to exist, the graph is *not* collectable. For the moment, I'm
talking about a graph that is *not* churning.

If we somehow can know that these objects have this sort of low in-degree
association, then (a) we can avoid tracing the graph, or (b) we can take
tracing the graph as a separable, low-frequency, incremental problem. In
either case we offload work from the problem of GCing the *general* heap.
In effect, these objects exist in a separate GC domain.

Conceptually, what I mean by a sub-heap is an explicit region that is a
child region of the general GC heap. All of the objects in our graph are
allocated from this explicit region. We distinguish (by unspecified means,
but I think I know how) between intra-region pointers and into-region
pointers, tracing only the into-region pointers to see whether the region
is still live.
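
As a rough illustration of that idea, here is a toy Python model. The names
(Obj, live_regions) are hypothetical, and the sketch assumes that objects in a
region hold no pointers back out to the general heap; a real collector would
need some remembered-set mechanism for those. Tracing stops at any pointer
that crosses into a region, so intra-region pointers are never followed:

```python
class Obj:
    """A heap object; region is None for objects in the general heap."""
    def __init__(self, name, region=None):
        self.name = name
        self.region = region
        self.refs = []


def live_regions(roots):
    """Trace the general heap from roots and return the regions reached.

    An into-region pointer marks its region live but is not followed,
    so the objects inside a region are never traced.
    """
    live = set()
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.region is not None:
            live.add(obj.region)  # into-region pointer: region stays live
            continue              # do not trace inside the region
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        stack.extend(obj.refs)
    return live
```

With a general-heap root pointing at one object of a cyclic two-object region,
live_regions reports the region as live; once that single pointer is cleared,
the region drops out of the result and can be released as a unit.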

In effect, what I'm trying to do is introduce an annotation at allocation
time that says "here are a bunch of objects that are all going to live or
die as a group. I'm telling you this by virtue of allocating them from an
explicitly identified region; now let's use that knowledge to do something
sensible."

Now as an implementation matter, this isn't just a labeling trick. The
objects allocated from that region can be allocated from distinct arena
chunks that are not traced by the tenured collection process. That's the
part that offloads effort from the GC.
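
A minimal sketch of the arena side, again as a toy Python model. The Region
class, its chunk size, and the (chunk, offset) "addresses" are all
assumptions made for illustration. Objects are bump-allocated from chunks
owned by the region, and the whole set of chunks is handed back at once:

```python
class Region:
    """Bump allocator over chunks that belong to a single region."""
    CHUNK_SIZE = 4096  # hypothetical chunk size in bytes

    def __init__(self):
        self.chunks = []
        self.offset = 0

    def alloc(self, size):
        """Reserve size bytes and return a (chunk index, offset) pair."""
        if size > self.CHUNK_SIZE:
            raise ValueError("object larger than a chunk")
        # Start a new chunk if there is none yet or the current one is full.
        if not self.chunks or self.offset + size > self.CHUNK_SIZE:
            self.chunks.append(bytearray(self.CHUNK_SIZE))
            self.offset = 0
        addr = (len(self.chunks) - 1, self.offset)
        self.offset += size
        return addr

    def release(self):
        """Drop every chunk at once; return how many chunks were freed."""
        freed = len(self.chunks)
        self.chunks.clear()
        self.offset = 0
        return freed
```

Because the chunks are owned by the region rather than by the general heap,
a collector in this model has no reason to walk them; release() is the only
reclamation step.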


>>> Why do we need ARC at all in the automatic system, since it can be
>>> slow for the non-single-threaded cases?
>>>
>>
>> I raised ARC in the context of trying to avoid mark/sweep overheads on
>> sub-heaps. If you aren't marking in a sub-heap, you need something that
>> lets you know when to release. ARC is one possibility. This is one of
>> several possible approaches to reducing the tenured churn problem.
>>
>
> Can region analysis show how often references will be counted and estimate
> a performance cost here?
>

No. That's not what region analysis does. You *really* need to go read the
Tofte paper and learn what region analysis is. I'm going to skip most of
your following questions, because they make no sense to me.


> What would you use for lots of string work? Strings are very likely to
> be the main use of a sub heap (no refs to anything else).
>

Strings (and more generally, reference-free types) are indeed an important
special case, and are very commonly managed in real runtimes by allocating
their payload from non-traced arena chunks. This can be done with a pretty
simple hack based solely on the per-object type information. It is
generally done only for reference-free objects that are *above* some
threshold size, and mainly done because the cost of copying/compacting for
such objects is high and has very little payoff. No region analysis is
required, so we don't usually think of this as a region-based allocation
problem.
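
The placement decision described here can be sketched in a few lines of
Python. TypeInfo, choose_arena, and the threshold value are hypothetical
names and numbers, not taken from any particular runtime:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeInfo:
    """Per-type information that an allocator could consult."""
    name: str
    has_references: bool


def choose_arena(ty, size, threshold=4096):
    """Pick an arena using only the type information and the object size.

    Reference-free objects above the (hypothetical) threshold go to
    untraced arena chunks; everything else stays in the traced heap.
    """
    if not ty.has_references and size > threshold:
        return "untraced"
    return "traced"
```

Under these assumptions, a large string body lands in the untraced arena,
while a small string or any reference-bearing object stays in the traced heap.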

But your intuition that we *could* think of it that way is correct. We
could do type-based regions, and a string region can be seen as an example
of this.

> ...I think the .NET GC has a separate pool for strings as it does not need
> to mark them...
>

The CLR implementation of strings is the same as the array implementation.
The main point is that the array object header and its payload are stored
separately. If the array body is of a reference-free type, then it is
stored in a reference-free arena. The object header for the array is marked
and swept, but that object header contains the *sole* pointer to the array
payload, so the payload pointer doesn't need to be traced.
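
A toy Python model of the header/payload split described above follows. It
illustrates the layout as stated here and is not a rendering of any actual
runtime's data structures; ArrayHeader and mark are hypothetical names. The
mark phase visits every header but scans a payload only when its element
type can hold references:

```python
class ArrayHeader:
    """Traced header that holds the sole pointer to its payload."""
    def __init__(self, payload, elem_has_refs):
        self.payload = payload
        self.elem_has_refs = elem_has_refs
        self.marked = False


def mark(roots):
    """Mark reachable headers; return the number of payload slots scanned."""
    scanned = 0
    stack = list(roots)
    while stack:
        hdr = stack.pop()
        if hdr.marked:
            continue
        hdr.marked = True
        if not hdr.elem_has_refs:
            continue  # reference-free payload: nothing inside to trace
        for elem in hdr.payload:
            scanned += 1
            if elem is not None:
                stack.append(elem)
    return scanned
```

For an array of two slots whose first slot refers to a string header, the
marker scans the two slots and marks both headers, but never touches the
string's bytes.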

This implementation isn't at all new. It predates the CLR by at least two
and maybe three decades.


>>> Not many type-safe languages allow these sorts of things to be done in a
>>> lib ..
>>>
>>
>> Who said anything about doing this stuff in a library?
>>
>
> You did :-)  "Some of them can be built as libraries."  They all look
> pretty hard to put in a lib; the freeze / mutability could be a lib, but
> it's nice for the system to know mutability.
>

Provided you have decent support for inlining and optimization, and you
also have linear types, ARC can be done in a library.
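
As a loose illustration of reference counting packaged as a library, here is
a Python sketch. Python has no linear types, so the rule that each handle is
dropped exactly once is only checked at run time here; with linear types the
compiler would enforce it statically. Counted, Handle, and on_release are
hypothetical names:

```python
class Counted:
    """Shared box: a value, a reference count, and a release callback."""
    def __init__(self, value, on_release):
        self.value = value
        self.count = 0
        self.on_release = on_release


class Handle:
    """One counted reference; must be dropped exactly once."""
    def __init__(self, box):
        self._box = box
        box.count += 1

    def get(self):
        if self._box is None:
            raise RuntimeError("handle used after drop")
        return self._box.value

    def clone(self):
        """Create a second handle to the same box, bumping the count."""
        if self._box is None:
            raise RuntimeError("handle used after drop")
        return Handle(self._box)

    def drop(self):
        """Give up this reference; release the value on the last drop."""
        box, self._box = self._box, None
        if box is None:
            raise RuntimeError("handle dropped twice")
        box.count -= 1
        if box.count == 0:
            box.on_release(box.value)
```

The release callback fires only when the last handle is dropped, which is
the point at which a sub-heap (or any other resource) could be freed.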


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
