Re: [bitc-dev] Runtime issues with unboxed arrays

Jonathan S. Shapiro Sat, 26 Oct 2013 10:42:43 -0700

On Sat, Oct 26, 2013 at 4:26 AM, Ben Kloosterman <[email protected]> wrote:


> I don't think object size had anything to do with it, e.g. embedded array vs a
> slightly smaller object and a separate large array object, and very large
> objects end up in the large heap so don't get relocated. 4.5 added support
> for objects bigger than 2G (e.g. Array objects are still huge).
>

In general, I agree that object size isn't the issue - there are lots of
ways to mark a large object in the heap incrementally.

A large object on the stack, however, has to be marked as part of nursery
collection. You can't evade or delay that, so it's a potentially serious
source of pause time complaints.


>
> I think the main reason is the way you use them , you need to pin them
> which gives a pointer...
>

The original "unboxed arrays" that didn't make it into .NET 1 would not
have needed to be pinned. They were an unboxed sequence type.

My source that this was intended and dropped is someone who would know, but
I don't recall him ever saying *why* they dropped it. My impression at the
time was that it was viewed as lower priority than a bunch of other things,
got left out for lack of time, and then somebody came up with the
unsafe/fixed solution and so nobody felt a need to go back and revisit it.
I don't think it's productive to speculate, since we're never going to
know. I suppose I could ask, but it's not something we really need to know,
is it?

The use-cases of interest, pragmatically, don't involve large arrays. Once
the array gets above a certain length, there's no real advantage to putting
it on the heap - especially with hybrid-RC collection - unless you're
trying to match some sort of externally imposed object layout (like a
network packet). But those tend to want to be on the heap anyway.


> By the time of v2 it was clear that .NET C# was not a native replacement
> and would take a lot more than this to make it so , they did add fixed
> marshalling which seems to  interop with native objects which have embedded
> arrays.
>

Fixed marshaling certainly helps, but I don't buy the inner reference
issue. Objects on the stack aren't relocatable, so inner references aren't
that big a problem as long as they are dominated by a known-live reference.
In this case, the known-live reference is the frame pointer, which is not
assignable, and you're done.

The more complex case is when arrays appear within heap objects, and the
array element type contains references. In that case you're going to need
to generate a /this/ pointer into the array interior, which is an inner
reference. There are a variety of ways to handle this particular case
without going to fat pointers, but it's a nuisance. Also, you can't always
rely on a dominant live reference unless the compiler is really careful.
Consider:

   for(int i = 0; i < ar.Size(); i++) {
      var inner = ar[i];
      (void) inner.MemberCall(...);
      ...
   }

Turns out this isn't safe, because a concurrently executing method can
perform a store to ar. This changes the array you are operating on, and
potentially its length (which is why the i < ar.Size() test can't be hoisted
in CLR). So then you need to make a local, temporary on-stack copy of the
reference to ar, but if the compiler introduces that copy it changes the
computation semantics.
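
To make the semantic difference concrete, here is a small Python sketch (not
CLR code). The names Holder, visit_rereading, visit_snapshot and swap_once are
hypothetical, and the "concurrent" store is simulated deterministically by
performing it from inside the per-element callback rather than from a second
thread. It shows only that the loop which re-reads ar and the loop which works
from a local copy of the reference can observe different elements.

```python
class Holder:
    """Hypothetical object whose field `ar` can be re-assigned by other code."""
    def __init__(self, items):
        self.ar = items


def visit_rereading(holder, on_visit):
    # Mirrors the loop as written: both the bound and the element load
    # go back to holder.ar on every iteration.
    seen = []
    i = 0
    while i < len(holder.ar):
        inner = holder.ar[i]
        seen.append(inner)
        on_visit(holder)  # stands in for inner.MemberCall(...)
        i += 1
    return seen


def visit_snapshot(holder, on_visit):
    # Takes a local copy of the reference first, so a later store to
    # holder.ar cannot change which array this loop walks.
    ar = holder.ar
    seen = []
    for i in range(len(ar)):
        inner = ar[i]
        seen.append(inner)
        on_visit(holder)
    return seen


def swap_once():
    # Simulated interfering store: replaces holder.ar on the first call only.
    done = []
    def on_visit(holder):
        if not done:
            holder.ar = ["x", "y"]
            done.append(True)
    return on_visit


rereading = visit_rereading(Holder(["a", "b", "c"]), swap_once())
snapshot = visit_snapshot(Holder(["a", "b", "c"]), swap_once())
```

The two loops return different sequences for the same input, which is why a
compiler cannot silently substitute one form for the other.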

There are lots of concurrency hazards like this hiding in CLR and JVM that
preclude obvious optimizations. Wonder of wonders, we've learned some
things in the last decade.


> re the precise GC reference vector Is it worth  saying that any array of
> unboxed objects must have objects that have no references as the use cases
> rarely require this...
>

That restriction exists in C# vectors already. It's a hack. My view is:
don't compromise the type system for the convenience of runtime encoding.
The fewer special cases you have, the happier you're going to be.

There are a bunch of DPF-like mechanisms we can use for this. The general
idea is that the bitmap becomes an instruction stream in a "marker
language". The opcodes are:

    MARK-WORDS N <bitmap>
    ITERATE N <instructions> CONTINUE
    MARK-OBJECT   // mark object pointed to by the GC cursor using its
                  // marking program

That ought to be about all you need.
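
As a rough illustration, here is one way such a marker language could be
interpreted, sketched in Python. Everything about the encoding is an
assumption made for the example: instructions are tuples, the body of ITERATE
is a nested list (so CONTINUE is implicit), bit i of the MARK-WORDS bitmap
describes word i, and MARK-OBJECT is taken to mean "the word at the cursor is
a type tag whose marking program is looked up in a hypothetical PROGRAMS table
and run in place". The interpreter only collects the offsets of reference
slots; a real collector would mark through them.

```python
# Hypothetical table: type tag -> marking program for that type.
PROGRAMS = {}


def run_marker(program, words, cursor=0, found=None):
    """Run a marking program over `words` starting at `cursor`.

    Returns (new_cursor, found), where `found` lists the offsets of the
    words the program identified as references.
    """
    if found is None:
        found = []
    for instr in program:
        op = instr[0]
        if op == "MARK-WORDS":
            count, bitmap = instr[1], instr[2]
            for i in range(count):
                if (bitmap >> i) & 1:      # bit i set: word i is a reference
                    found.append(cursor + i)
            cursor += count
        elif op == "ITERATE":
            times, body = instr[1], instr[2]
            for _ in range(times):         # end of body acts as CONTINUE
                cursor, found = run_marker(body, words, cursor, found)
        elif op == "MARK-OBJECT":
            tag = words[cursor]            # assumed: a type tag sits here
            cursor, found = run_marker(PROGRAMS[tag], words, cursor + 1, found)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return cursor, found


# An unboxed array of three {int, ref, ref} elements.
array_program = [("ITERATE", 3, [("MARK-WORDS", 3, 0b110)])]
array_words = [1, "a", "b", 2, "c", "d", 3, "e", "f"]
end, refs = run_marker(array_program, array_words)
```

In this sketch the array needs one ITERATE around a single three-word bitmap,
independent of the array length, and the element type is free to contain
references.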

> How do you pass references to this embedded array around?
>

In CLR, you can't generate inner references, so you don't. Except, yeah, if
the objects have methods then their this pointer can leak on call, so it's
a hairball. But that's equally a problem with heap-allocated vectors. Which
may explain why the vector element type must be either a single reference
or a reference-free value type.

For heap-allocated vectors it's not a big deal for large object space. It's
smaller vectors that are a mess.

> BTW both Rust and Go have replaced segmented stacks, Go now does a global
> pause , copies in a new stack and adjusts pointers to stack .
>

Replaced them with what? (misplaced referent). Do you mean that they have
moved *to* segmented stacks, or *from* segmented stacks?


shap

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
