On 30/09/2013 9:48 PM, david j wrote:
On Mon, Sep 30, 2013 at 5:33 PM, Sandro Magi <[email protected]> wrote:

A properly implemented language founded on CR for mutable state could totally achieve within 10% IMO.

I can see how this is true if all your versioned elements are scalars or reference types, but I don't see how this could possibly be true in the general case of any larger sequential data structure.

For example, consider looping over a large value type array. The non-versioned value type array is going to be large cache-efficient sequential memory access. If we have to version writes to this data, we have to decide between either (a) expensively copying the entire array because we changed one entry, or (b) dealing with indirection overhead for every value of the array (which is going to cost a lot more than 10% vs a sequential scan).

Even a large class-object of scalars turned into versioned scalars has disastrously comparable cache effects if you access many fields of the object.
1. I don't think they're nearly as disastrous as you're implying, for the simple reason that the Versioned<T> instance is allocated right after its containing class, which means it's very likely to be in the same cache line. Indirection is then comparable to a Brooks-style read barrier when the forwarding pointer is pointing to itself, which has been empirically validated at less than 10%.
2. You're assuming a Versioned<T>[], which permits efficient update of individual cells, but if you're more frequently doing batch updates over the whole array, you're better off doing Versioned<T[]>, which aggregates all the individual cloning ops into a single cloning op.
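To make the trade-off in point 2 concrete, here is a minimal sketch in Python. It is a toy model, not the actual Versioned<T> implementation: the `Versioned` class, the `ROOT` revision id, and the helpers `double_per_cell` and `double_whole_array` are all hypothetical names made up for illustration, and revisions are plain integers with no join or merge logic. It only counts how much versioning bookkeeping each layout does for a batch update.

```python
ROOT = 0  # id of the root revision (an assumption of this sketch)


class Versioned:
    """Toy stand-in for Versioned<T>: one private value per revision id."""

    def __init__(self, initial):
        self.versions = {ROOT: initial}
        self.copies = 0  # number of whole-value clones performed

    def read(self, rev):
        # A revision that has not written yet sees the root value.
        return self.versions.get(rev, self.versions[ROOT])

    def write(self, rev, value):
        # Replace the value seen by `rev`; other revisions are unaffected.
        self.versions[rev] = value

    def writable(self, rev):
        # For list values only: clone once on first write, then mutate in place.
        if rev not in self.versions:
            self.versions[rev] = list(self.versions[ROOT])
            self.copies += 1
        return self.versions[rev]


def double_per_cell(n, rev):
    """Versioned<T>[] style: every cell gains its own version record."""
    cells = [Versioned(i) for i in range(n)]
    for cell in cells:
        cell.write(rev, cell.read(rev) * 2)
    records = sum(len(c.versions) - 1 for c in cells)
    return cells, records


def double_whole_array(n, rev):
    """Versioned<T[]> style: one clone, then a plain sequential pass."""
    array = Versioned(list(range(n)))
    data = array.writable(rev)
    for i in range(len(data)):
        data[i] *= 2
    return array, array.copies
```

In this model a batch update over n cells creates n version records in the per-cell layout, but only a single clone in the whole-array layout, after which the loop runs over an ordinary list. The reverse holds for a single-cell update, where the per-cell layout touches one record and the whole-array layout still pays for a full clone.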
A language based on CR can be even more aggressive, since (1) it can identify the individual points where mutation occurs and thus avoid full clones, and (2) it can allocate value-type Versioned<T> instances, as you suggest below.
Do you see what I was getting at now?

I've been considering whether it's reasonable to build a value-type implementation of the versions container. Of course it would have a fixed-size number of in-place slots after which it would need to use a pointer to 'overflow', but I think this would be better than adding a pointer indirection to every value type.
Yes, it's possible in some cases [1], but it must be fully encapsulated, and managing the lifetime of the thread-local slot is pushed up to the slot container instead of being provided by the slot itself.
Sandro

[1] Basically, anywhere the CLR allows a private mutable field will permit a value type VersionedField<T>, but the instance must never escape the container, and the container must implement a finalizer which returns the slot to the pool (basically reproducing the finalization work of the Versioned<T> class type). I've implemented this in a private prototype since the same thread-local logic and lifetime management applies to instance-specific ThreadLocal<T> variables, so I plan to reuse it in a few different contexts.
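
For illustration, the fixed-slot versions container david j describes could follow lookup logic roughly like the Python sketch below. Everything here is an assumption: the class name `InlineVersions`, the choice of two in-place slots, and the use of integer revision ids are invented for the example, and Python cannot express a real value-type layout, so this shows only the slot-then-overflow logic, not the memory behaviour or the slot pool and finalizer from the footnote.

```python
class InlineVersions:
    """Toy versions container: two in-place slots, then a lazily created overflow map."""

    __slots__ = ("rev0", "val0", "rev1", "val1", "overflow")

    def __init__(self):
        # None marks an empty slot; revision ids are assumed to be integers.
        self.rev0 = self.rev1 = None
        self.val0 = self.val1 = None
        self.overflow = None  # stays None until both in-place slots are taken

    def set(self, rev, value):
        if self.rev0 is None or self.rev0 == rev:
            self.rev0, self.val0 = rev, value
        elif self.rev1 is None or self.rev1 == rev:
            self.rev1, self.val1 = rev, value
        else:
            # Only a third distinct revision pays for the extra indirection.
            if self.overflow is None:
                self.overflow = {}
            self.overflow[rev] = value

    def get(self, rev, default=None):
        if rev == self.rev0:
            return self.val0
        if rev == self.rev1:
            return self.val1
        if self.overflow is not None:
            return self.overflow.get(rev, default)
        return default
```

With this shape, a value written by at most two revisions never allocates the overflow map, which is the case the in-place slots are meant to keep cheap.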
_______________________________________________ bitc-dev mailing list [email protected] http://www.coyotos.org/mailman/listinfo/bitc-dev
