On Sun, Oct 27, 2013 at 1:41 AM, Jonathan S. Shapiro <[email protected]> wrote:
> On Sat, Oct 26, 2013 at 4:26 AM, Ben Kloosterman <[email protected]> wrote:
>
>> I don't think object size had anything to do with it, e.g. an embedded
>> array vs. a slightly smaller object plus a separate large array object;
>> very large objects end up in the large object heap so they don't get
>> relocated. .NET 4.5 added support for objects bigger than 2 GB (e.g.
>> Array objects can still be huge).
>
> In general, I agree that object size isn't the issue - there are lots of
> ways to mark a large object in the heap incrementally.
>
> A large object on the stack, however, has to be marked as part of nursery
> collection. You can't evade or delay that, so it's a potentially serious
> source of pause time complaints.

I don't think this is an issue:
- Only value types containing an unboxed array are a problem, and value
  types are passed by value, so they are unlikely to be big.
- If one is too big you must autobox it, and stack overflow also becomes
  an issue.
- Only advanced programmers would use this, and they would know the issue.

>> I think the main reason is the way you use them: you need to pin them,
>> which gives a pointer...
>
> The original "unboxed arrays" that didn't make it into .Net 1 would not
> have needed to be pinned. They were an unboxed sequence type.
>
> My source that this was intended and dropped is someone who would know,
> but I don't recall him ever saying *why* they dropped it. My impression
> at the time was that it was viewed as lower priority than a bunch of
> other things, got left out for lack of time, and then somebody came up
> with the unsafe/fixed solution and so nobody felt a need to go back and
> revisit it. I don't think it's productive to speculate, since we're never
> going to know. I suppose I could ask, but it's not something we really
> need to know, is it.

Yep, ask :-) And I'm sure the authors of RC-Immix and URC can answer some
questions as well; I'm sure they would be interested that you are
considering using their work.

> The use-cases of interest, pragmatically, don't involve large arrays.
> Once the array gets above a certain length, there's no real advantage to
> not putting it on the heap - especially with hybrid-RC collection -
> unless you're trying to match some sort of externally imposed object
> layout (like a network packet). But those tend to want to be on the heap
> anyway.

Yes, they would all go to the large object heap anyway.

> The more complex case is when arrays appear within heap objects, and the
> array element type contains references. In that case you're going to
> need to generate a /this/ pointer into the array interior, which is an
> inner reference. There are a variety of ways to handle this particular
> case without going to fat pointers, but it's a nuisance. Also, you can't
> always rely on a dominant live reference unless the compiler is really
> careful. Consider:
>
>     for (int i = 0; i < ar.Size(); i++) {
>         var inner = ar[i];
>         (void) inner.MemberCall(...);
>         ...
>     }
>
> Turns out this isn't safe, because a concurrently executing method can
> perform a store to ar. This changes the array you are operating on, and
> potentially its length (which is why the i < ar.Size() can't be hoisted
> in CLR).

I was thinking unboxed arrays are always fixed length (though you can
still set a member to null). Variable-size embedded arrays are way too
hard. You're probably referring to arrays up to a certain size embedded in
the object, but even in that case I would throw it back to the user to
manage the size. (Maybe a standard library object can manage it later.)
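Roughly what I mean by "throw it back to the user", sketched with today's
unsafe/fixed workaround (the type name is made up, and a real unboxed
array feature would presumably not need the unsafe parts):

    // Fixed capacity embedded in the struct itself; the logical Count is
    // left to the user (or later to a small standard-library wrapper).
    public unsafe struct SmallBuffer
    {
        private const int Capacity = 16;
        public fixed int Items[16];   // storage laid out inline, fixed length
        public int Count;             // user-managed logical size

        public void Add(int value)
        {
            if (Count >= Capacity)
                throw new System.InvalidOperationException("SmallBuffer is full");
            fixed (int* p = Items)    // C# needs a fixed statement to index the buffer via 'this'
            {
                p[Count] = value;
            }
            Count++;
        }
    }

The point is just that the capacity is fixed at compile time and the size
bookkeeping stays with the user; with a proper unboxed sequence type the
unsafe block and the primitive-element restriction would go away.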
> So then you need to make a local, temporary on-stack copy of the
> reference to ar, but if the compiler introduces that it changes the
> computation semantics.

Similar things happen all over the place, e.g.

    for (int i = 0; i < obj.arrayList.Count(); i++) {
        var inner = obj.arrayList[i];
        (void) inner.MemberCall(...);
        ...
    }
    // other thread sets obj.arrayList to null, or sets any member to null

This is why foreach is much better (though I have no idea how you could
throw the collection-modified exception on unboxed arrays and slices). All
over .NET there are comments that iterating over a collection that is
being modified is not thread safe.

> There are lots of concurrency hazards like this hiding in CLR and JVM
> that preclude obvious optimizations. Wonder of wonders, we've learned
> some things in the last decade.

Yep... That's why I'm against concurrency safety as a general case; it's a
false promise. It doesn't help in many cases: you still need to know what
you're doing, and when newer programmers do it, things still go pop. 90%
of user safety is a lock so the code is single threaded. We could do
better in where we place the lock. People who write lockless code are good
enough to know the issues. That said, there should be a simple list of
what works, and some safe constructs that always work and that affect the
compiler: concurrent_foreach, lock(this), slimlock(this), etc.

>> Re the precise GC reference vector: is it worth saying that any array
>> of unboxed objects must have objects that have no references, as the
>> use cases rarely require this...
>
> That restriction exists in C# vectors already. It's a hack. My view is:
> don't compromise the type system for the convenience of runtime
> encoding. The fewer special cases you have, the happier you're going to
> be.

.NET only allows native types (and that is a hack), which is not good
enough; I think allowing user structs like Points and Matrixes, as well as
references that don't hold further references, like string, is fine. The
condition is that you need a DoesNotContainReferences type flag, but we
need that for the GC anyway, rather than just treating string as a special
case and picking up only half of them.

Re convenience of the runtime: true, but on the other side of the coin, if
no one needs it, why build support? You can build V0.5 without it, and if
people complain and want it, add it then. It would not be a huge compiler
change to remove a restriction.

Note that in the case of allowing references in the unboxed vector, the
mark must still walk the array, but not the objects it points to. This can
go in the type reference vector just as if they were members, so it's very
easy. References holding references should even be OK (though it has the
wrong feel to me). However, if value objects hold further references it
gets more tricky, and the precise GC reference lookup vector (and objects)
will be a tree and could get pretty big. (It is possible, since structs
can't have cycles; unboxed vectors need the same limitation.) Restricting
it would prevent more stupid things (super large objects, etc.) than it
would allow smart things.

> There are a bunch of DPF-like mechanisms we can use for this. The
> general idea is that the bitmap becomes an instruction stream in a
> "marker language". The opcodes are:
>
>     MARK-WORDS N <bitmap>
>     ITERATE N <instructions> CONTINUE
>     MARK-OBJECT   // mark object pointed to by the GC cursor using its
>                   // marking program
>
> That ought to be about all you need.
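To check I'm reading the marker language right, here is my guess at how an
object with two scalar words, one reference field, and an embedded vector
of eight value-type elements (each carrying a reference of its own) would
be described (the encoding details are mine, not necessarily yours):

    MARK-WORDS 3 001    // two scalar words, then one word that is a reference
    ITERATE 8           // walk the eight embedded elements
        MARK-OBJECT     // run the element type's own marking program at the cursor
    CONTINUE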
Don't you need recursion to handle a tree of different-sized value types,
some of which have references?

>> How do you pass references to this embedded array around?
>
> In CLR, you can't generate inner references, so you don't. Except, yeah,
> if the objects have methods then their this pointer can leak on call, so
> it's a hairball. But that's equally a problem with heap-allocated
> vectors. Which may explain why the vector element type must be either a
> single reference or a reference-free value type.

It's kind of closed in the design: you can take a ref to a value type (a
ref parameter on a function), which is an interior pointer passed on the
stack, but because it's a value type you can't store it, as it will be
copied.
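A quick sketch of what I mean, in today's C# (the names are made up):

    static class RefSketch
    {
        struct Elem { public int N; }

        // 'e' is an interior pointer to wherever the element lives, but it
        // only exists on the stack for the duration of this call.
        static void Bump(ref Elem e) { e.N++; }

        static void Demo()
        {
            var vec = new Elem[4];
            Bump(ref vec[2]);       // interior reference into the array: allowed
            Elem copy = vec[2];     // keeping the element around copies the value
            copy.N += 1;            // mutates the copy, not vec[2]
            // There is no way to store the ref itself in a field.
        }
    }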
To me this just hangs together (and does fail, as you stated, at the this
pointer).

Ben

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
