On Sun, Oct 27, 2013 at 1:41 AM, Jonathan S. Shapiro <[email protected]> wrote:
> On Sat, Oct 26, 2013 at 4:26 AM, Ben Kloosterman <[email protected]> wrote:
>
>> I don't think object size had anything to do with it, e.g. an embedded
>> array vs. a slightly smaller object plus a separate large array object;
>> very large objects end up in the large object heap so they don't get
>> relocated. .NET 4.5 added support for objects bigger than 2 GB (e.g.
>> Array objects can still be huge).
>
> In general, I agree that object size isn't the issue - there are lots of
> ways to mark a large object in the heap incrementally.
>
> A large object on the stack, however, has to be marked as part of nursery
> collection. You can't evade or delay that, so it's a potentially serious
> source of pause time complaints.

I don't think this is an issue:
- Only value types containing an unboxed array are a problem, and value
  types are passed by value, so they are unlikely to be big.
- If one is too big you must autobox it, and stack overflow also becomes
  an issue.
- Only advanced programmers would use this, and they would know the issue.

>> I think the main reason is the way you use them: you need to pin them,
>> which gives a pointer...
>
> The original "unboxed arrays" that didn't make it into .Net 1 would not
> have needed to be pinned. They were an unboxed sequence type.
>
> My source that this was intended and dropped is someone who would know,
> but I don't recall him ever saying *why* they dropped it. My impression
> at the time was that it was viewed as lower priority than a bunch of
> other things, got left out for lack of time, and then somebody came up
> with the unsafe/fixed solution and so nobody felt a need to go back and
> revisit it. I don't think it's productive to speculate, since we're never
> going to know. I suppose I could ask, but it's not something we really
> need to know, is it.

Yep, ask :-) And I'm sure the authors of RC-Immix and URC can answer some
questions as well; I'm sure they would be interested that you are
considering using their work.

> The use-cases of interest, pragmatically, don't involve large arrays.
> Once the array gets above a certain length, there's no real advantage to
> not putting it on the heap - especially with hybrid-RC collection -
> unless you're trying to match some sort of externally imposed object
> layout (like a network packet). But those tend to want to be on the heap
> anyway.

Yes, they would all go to the large object heap anyway.

> The more complex case is when arrays appear within heap objects, and the
> array element type contains references. In that case you're going to
> need to generate a /this/ pointer into the array interior, which is an
> inner reference. There are a variety of ways to handle this particular
> case without going to fat pointers, but it's a nuisance. Also, you can't
> always rely on a dominant live reference unless the compiler is really
> careful. Consider:
>
>     for (int i = 0; i < ar.Size(); i++) {
>         var inner = ar[i];
>         (void) inner.MemberCall(...);
>         ...
>     }
>
> Turns out this isn't safe, because a concurrently executing method can
> perform a store to ar. This changes the array you are operating on, and
> potentially its length (which is why the i < ar.Size() can't be hoisted
> in CLR).

I was thinking unboxed arrays are always fixed length (though you can
still set a member to null). Variable-size embedded arrays are way too
hard. You're probably referring to arrays up to a certain size embedded in
the object, but even in that case I would throw it back to the user to
manage the size. (Maybe a standard library object can manage it later.)
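Roughly what I mean by "throw it back to the user", sketched with today's
unsafe/fixed workaround (the type name is made up, and a real unboxed
array feature would presumably not need the unsafe parts):

    // Fixed capacity embedded in the struct itself; the logical Count is
    // left to the user (or later to a small standard-library wrapper).
    public unsafe struct SmallBuffer
    {
        private const int Capacity = 16;
        public fixed int Items[16];   // storage laid out inline, fixed length
        public int Count;             // user-managed logical size

        public void Add(int value)
        {
            if (Count >= Capacity)
                throw new System.InvalidOperationException("SmallBuffer is full");
            fixed (int* p = Items)    // C# needs a fixed statement to index the buffer via 'this'
            {
                p[Count] = value;
            }
            Count++;
        }
    }

The point is just that the capacity is fixed at compile time and the size
bookkeeping stays with the user; with a proper unboxed sequence type the
unsafe block and the primitive-element restriction would go away.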
> So then you need to make a local, temporary on-stack copy of the
> reference to ar, but if the compiler introduces that it changes the
> computation semantics.

Similar things happen all over the place, e.g.

    for (int i = 0; i < obj.arrayList.Count(); i++) {
        var inner = obj.arrayList[i];
        (void) inner.MemberCall(...);
        ...
    }
    // other thread sets obj.arrayList to null, or sets any member to null

This is why foreach is much better (though I have no idea how you could
throw the collection-modified exception on unboxed arrays and slices). All
over .NET there are comments that iterating over a collection that is
being modified is not thread safe.

> There are lots of concurrency hazards like this hiding in CLR and JVM
> that preclude obvious optimizations. Wonder of wonders, we've learned
> some things in the last decade.

Yep... That's why I'm against concurrency safety as a general case; it's a
false promise. It doesn't help in many cases: you still need to know what
you're doing, and when newer programmers do it, things still go pop. 90%
of user safety is a lock so the code is single threaded. We could do
better in where we place the lock. People who write lockless code are good
enough to know the issues. That said, there should be a simple list of
what works, and some safe constructs that always work and that affect the
compiler: concurrent_foreach, lock(this), slimlock(this), etc.

>> Re the precise GC reference vector: is it worth saying that any array
>> of unboxed objects must have objects that have no references, as the
>> use cases rarely require this...
>
> That restriction exists in C# vectors already. It's a hack. My view is:
> don't compromise the type system for the convenience of runtime
> encoding. The fewer special cases you have, the happier you're going to
> be.

.NET only allows native types (and that is a hack), which is not good
enough; I think allowing user structs like Points and Matrixes, as well as
references that don't hold further references, like string, is fine. The
condition is that you need a DoesNotContainReferences type flag, but we
need that for the GC anyway, rather than just treating string as a special
case and picking up only half of them.

Re convenience of the runtime: true, but on the other side of the coin, if
no one needs it, why build support? You can build V0.5 without it, and if
people complain and want it, add it then. It would not be a huge compiler
change to remove a restriction.

Note that in the case of allowing references in the unboxed vector, the
mark must still walk the array, but not the objects it points to. This can
go in the type reference vector just as if they were members, so it's very
easy. References holding references should even be OK (though it has the
wrong feel to me). However, if value objects hold further references it
gets more tricky, and the precise GC reference lookup vector (and objects)
will be a tree and could get pretty big. (It is possible, since structs
can't have cycles; unboxed vectors need the same limitation.) Restricting
it would prevent more stupid things (super large objects, etc.) than it
would allow smart things.

> There are a bunch of DPF-like mechanisms we can use for this. The
> general idea is that the bitmap becomes an instruction stream in a
> "marker language". The opcodes are:
>
>     MARK-WORDS N <bitmap>
>     ITERATE N <instructions> CONTINUE
>     MARK-OBJECT   // mark object pointed to by the GC cursor using its
>                   // marking program
>
> That ought to be about all you need.
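To check I'm reading the marker language right, here is my guess at how an
object with two scalar words, one reference field, and an embedded vector
of eight value-type elements (each carrying a reference of its own) would
be described (the encoding details are mine, not necessarily yours):

    MARK-WORDS 3 001    // two scalar words, then one word that is a reference
    ITERATE 8           // walk the eight embedded elements
        MARK-OBJECT     // run the element type's own marking program at the cursor
    CONTINUE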
Don't you need recursion to handle a tree of different-sized value types,
some of which have references?

>> How do you pass references to this embedded array around?
>
> In CLR, you can't generate inner references, so you don't. Except, yeah,
> if the objects have methods then their this pointer can leak on call, so
> it's a hairball. But that's equally a problem with heap-allocated
> vectors. Which may explain why the vector element type must be either a
> single reference or a reference-free value type.

It's kind of closed in the design: you can take a ref to a value type (a
ref parameter on a function), which is an interior pointer passed on the
stack, but because it's a value type you can't store it, as it will be
copied.
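A quick sketch of what I mean, in today's C# (the names are made up):

    static class RefSketch
    {
        struct Elem { public int N; }

        // 'e' is an interior pointer to wherever the element lives, but it
        // only exists on the stack for the duration of this call.
        static void Bump(ref Elem e) { e.N++; }

        static void Demo()
        {
            var vec = new Elem[4];
            Bump(ref vec[2]);       // interior reference into the array: allowed
            Elem copy = vec[2];     // keeping the element around copies the value
            copy.N += 1;            // mutates the copy, not vec[2]
            // There is no way to store the ref itself in a field.
        }
    }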
To me this just hangs together (and does fail, as you stated, at the this
pointer).

Ben

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
