On 30/09/2013 11:19 AM, david j wrote:
Sandro, thanks for posting those links [3,4]. I have to agree with you: this is the most promising abstraction for structuring fine-grained parallelism I have seen to date. I suspect it may be very hard to keep the overhead near their observed 5% for less trivial data structures, but it is worth a shot!
A properly implemented language founded on CR for mutable state could stay within 10% overhead, IMO. Since CR variables are implemented via thread-local arrays, the overhead CR adds to each read and write is a pointer offset plus one additional pointer indirection (which cost 5-10% for Brooks-style read barriers, last I checked). If the runtime keeps a pointer to the block of thread-local storage in a register to reduce that cost as much as possible, and the compiler emits copy-on-write only where it is needed, then 5-10% seems reasonable.
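To make the read and write paths above concrete, here is a minimal sketch in Python, not the paper's implementation. It assumes every versioned variable is created in the root revision before any fork, and that a join simply lets the joined revision's writes win. The names Revision, Versioned, fork, join and mutate are hypothetical, chosen for illustration only.

```python
import copy


class Revision:
    """One isolated view of every versioned variable (hypothetical name)."""

    def __init__(self, slots):
        self.slots = slots      # slot array: one entry per versioned variable
        self.written = set()    # indices written by this revision
        self.private = set()    # indices whose value no other revision shares

    def fork(self):
        # After a fork every value is shared with the child, so neither side
        # may update a value in place until it has cloned it.
        self.private.clear()
        return Revision(list(self.slots))  # copies the array, not the values

    def join(self, child):
        # Assumed merge policy: the joined revision's writes win.
        for index in child.written:
            self.slots[index] = child.slots[index]
            self.written.add(index)
            if index in child.private:
                self.private.add(index)
            else:
                self.private.discard(index)


class Versioned:
    """A variable whose value is looked up through the current revision."""

    def __init__(self, root, value):
        self.index = len(root.slots)  # fixed offset into every slot array
        root.slots.append(value)
        root.written.add(self.index)
        root.private.add(self.index)

    def get(self, revision):
        # Read path: one offset plus one indirection through the revision.
        return revision.slots[self.index]

    def set(self, revision, value):
        # Replacing the value needs no clone; the caller is assumed to pass
        # a value that no other revision can mutate.
        revision.slots[self.index] = value
        revision.written.add(self.index)
        revision.private.add(self.index)

    def mutate(self, revision):
        # Copy-on-write: clone a shared value before the first in-place update.
        if self.index not in revision.private:
            revision.slots[self.index] = copy.deepcopy(revision.slots[self.index])
            revision.private.add(self.index)
        revision.written.add(self.index)
        return revision.slots[self.index]


root = Revision([])
counter = Versioned(root, 0)
items = Versioned(root, [1, 2])

child = root.fork()
counter.set(child, 5)              # immutable value: no clone needed
items.mutate(child).append(3)      # mutable value: cloned on first write
isolated = (counter.get(root), items.get(root))
root.join(child)
merged = (counter.get(root), items.get(root))
```

In this sketch get is exactly the offset plus indirection described above, and only mutate ever pays for a clone, which is the part a compiler could elide when it can prove the value is unshared or immutable.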
A library approach to CR is obviously less efficient, but the CLR on which they're conducting their experiments retains runtime types, so thread-local state is also type-indexed, allowing it to be much more efficient once code generation stabilizes. IIRC, their 5% figure was measured mostly with variables of atomic/immutable types, like int32, float32, etc., so cloning overheads are low. However, you can efficiently, albeit conservatively, check at runtime whether a type is immutable [1], so if you ensure your objects are largely immutable, or you ensure that all mutable state is encapsulated in Versioned<T> instances, then that overhead could also be relatively small.
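As a rough illustration of the conservative immutability check, here is a Python analogue. It is not the Sasa code in [1], which inspects CLR types; this sketch inspects values instead, and the names is_immutable and clone_for_write are hypothetical. It answers True only for cases it recognises and treats everything else as mutable.

```python
import copy

# Built-in types whose instances can never change after construction.
_ATOMIC_TYPES = {int, float, complex, bool, str, bytes, type(None)}


def is_immutable(value):
    """Conservative check: True only when the value is known to be immutable."""
    kind = type(value)
    # Exact type match on purpose: a subclass could add mutable attributes.
    if kind in _ATOMIC_TYPES:
        return True
    # Immutable containers are immutable only if everything inside them is.
    if kind in (tuple, frozenset):
        return all(is_immutable(item) for item in value)
    # Anything unrecognised is assumed mutable (possible false negatives).
    return False


def clone_for_write(value):
    """Skip the clone when the value cannot be mutated anyway."""
    return value if is_immutable(value) else copy.deepcopy(value)


point = (1.0, 2.0)
scores = [10, 20]
same_point = clone_for_write(point)    # shared: no copy made
fresh_scores = clone_for_write(scores)  # copied: safe to mutate privately
```

A false negative here only costs an unnecessary clone, so the conservative answer is always safe.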
The LVars paper I just posted about is also an interesting point in the design space that I'm still looking into.
Sandro

[1] http://sourceforge.net/p/sasa/code/ci/default/tree/Sasa.Dynamics/Type.cs#l79
On Sep 30, 2013 5:07 AM, "William ML Leslie" <[email protected]> wrote:
> Nevertheless, what Carmack seems to be describing is GC (with card
> tables, I guess).

I don't think so. I think he is describing a two-revision GC integrated version of [3], where the indirection cost is removed by replacing their COW and version lookup with a full copy integrated with the GC relocate.

[3] http://research.microsoft.com/apps/pubs/default.aspx?id=132619
[4] http://research.microsoft.com/apps/pubs/default.aspx?id=150180

> He wants more control, but like many, doesn't
> quite seem to grasp what that control needs to be.

That is not what I see. I see a very specific desire to synchronize GC sweeps with frame rendering - letting standard gc handle lifetime. (No static lifetime analysis)

However, I suspect [3] may be dramatically more efficient, capable, and flexible ... Just so long as GC stop the world is sufficiently minimized.

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
