On May 13, 2009, at 6:58 PM, wren ng thornton wrote:

Jan-Willem Maessen wrote:
I wanted to clear up one misconception here...
wren ng thornton wrote:
> In heavily GCed languages like Haskell allocation and collection is
> cheap, so we don't mind too much; but in Java and the like, both
> allocation and collection are expensive so the idea of cheap throwaway
> objects is foreign.
Not true!

I was speaking of Java, not Clojure. I believe the costs in Java are well documented, though I don't know enough about the JVM to know where the blame belongs. (All I know of Clojure is that it's a Lisp-like on the JVM :)

I think you're missing the point here: the code I refer to below *is in Java* and is running on a standard JVM; the "costs" you refer to simply don't exist! As Vladimir Ivanov points out, and as Rich Hickey is happy to observe in his talks on Clojure, the JVM handles allocation-intensive, garbage-intensive programs very well.

If you look at the internals of Clojure, you'll discover they're using trees with *very* wide fanout (e.g. 32-way branching in its persistent vectors). Why? Because it's so cheap to allocate and GC these structures! By using shallow-but-wide trees we reduce the cost of indexing and accessing list elements. I suspect you'd still be hard-pressed to support this kind of allocation behavior in any of the present Haskell implementations, and Haskell implementations of the same kinds of structures have limited fanout to 2-4 elements or so.

I was under the impression that the reason data structures in Haskell tend to be limited to 4-fanout had more to do with the cleanliness of the implementations: pattern matching on 64-wide cells is quite ugly, as is dealing with the proliferation of corner cases for complex structures like finger trees, patricia trees, etc. The use of view patterns could clean this up significantly. On the other hand, we do have things like lazy ByteStrings and UVector which do have wide fanouts.
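To make the view-patterns point concrete, here is a small sketch (my own illustration, not code from any library) of how a wide node can be stored as a single array field and then pattern-matched through a view, instead of declaring a constructor with dozens of slots:

```haskell
{-# LANGUAGE ViewPatterns #-}
import Data.Array (Array, listArray, bounds, (!))

-- A single wide node: elements live in one boxed array,
-- not in 64 separate constructor fields.
data Wide a = Wide (Array Int a)

-- A view exposing a wide node as head-plus-rest, so callers can
-- pattern match without naming every slot.
unconsW :: Wide a -> Maybe (a, [a])
unconsW (Wide arr)
  | lo > hi   = Nothing
  | otherwise = Just (arr ! lo, [arr ! i | i <- [lo + 1 .. hi]])
  where (lo, hi) = bounds arr

describe :: Wide Int -> String
describe (unconsW -> Just (x, _)) = "starts with " ++ show x
describe _                        = "empty"

main :: IO ()
main = putStrLn (describe (Wide (listArray (0, 7) [10 .. 17])))
-- prints: starts with 10
```

The constructor count stays at one regardless of the fanout; the view function, not the data declaration, decides how matches look.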

Hmm, I think neither of the data structures you name actually supports both O(lg n) indexing and O(lg n) cons or append. That said, your point is well taken, so let's instead state it as a challenge:

Can you, oh Haskellers, implement a fast, wide-fanout (say >= 8) tree-based sequence implementation in Haskell, which supports at-least-log-time indexing and at-least-log-time cons with a large base for the logarithm? Can you do it without turning off array bounds checking (either by using unsafe operations or low-level peeking and poking) and without using an algebraic data type with O(f) constructors for fanout of f? You can turn off bounds checks if your program encodes static guarantees that indices cannot be out of bounds (there are a couple of libraries to do this).

The spirit here is "Work in Haskell with safe operations and no FFI except through safe libraries, but otherwise use any extensions you like."

I actually think this *is* doable, but it touches a few areas where Haskell doesn't presently do well (the bounds checking in particular is a challenge). I threw in the bounds checking when I realized that in fact the equivalent Java code is always bounds checked, and these bounds checks are then optimized away where possible. Actually, I'd *love* to see an *in*efficient solution to eliminating as many bounds checks as possible!
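In that spirit, here is a minimal, untuned sketch (my own invention, every name hypothetical) of a base-8 generalisation of a binary random-access list: the sequence is a list of "digits", where the i-th digit holds fewer than 8 complete 8-ary trees of size 8^i. Cons is O(1) amortised (O(log n) worst case, when carries cascade), indexing is O(log n) with 8 as the base of the logarithm for the tree depth, the node type has a single constructor regardless of fanout, and the only array accesses go through the bounds-checked (!) from Data.Array:

```haskell
import Data.Array (Array, listArray, (!))

fanout :: Int
fanout = 8

-- A complete fanout-ary tree with a cached size at each node.
data Tree a = Leaf a
            | Node !Int (Array Int (Tree a))

size :: Tree a -> Int
size (Leaf _)   = 1
size (Node s _) = s

-- Digits are least-significant first; the i-th digit holds
-- fewer than `fanout` trees, each of size fanout^i.
newtype Seq a = Seq [[Tree a]]

empty :: Seq a
empty = Seq []

-- Add a leaf to the lowest digit, carrying when a digit fills up.
cons :: a -> Seq a -> Seq a
cons x (Seq ds) = Seq (carry (Leaf x) ds)
  where
    carry t [] = [[t]]
    carry t (d : rest)
      | length d < fanout - 1 = (t : d) : rest
      | otherwise             = [] : carry (node (t : d)) rest
    node ts = Node (sum (map size ts)) (listArray (0, fanout - 1) ts)

-- Indexing: skip whole digits, then whole trees, then descend one
-- tree, always via the bounds-checked (!).
index :: Seq a -> Int -> a
index (Seq ds0) i0 = digits ds0 i0
  where
    digits []       _ = error "index: out of bounds"
    digits (d : ds) i = scan d i
      where
        scan []       j = digits ds j
        scan (t : ts) j
          | j < size t = tree t j
          | otherwise  = scan ts (j - size t)
    tree (Leaf a)     0 = a
    tree (Leaf _)     _ = error "index: out of bounds"
    tree (Node _ arr) j = child 0 j
      where
        child c j'
          | j' < size (arr ! c) = tree (arr ! c) j'
          | otherwise           = child (c + 1) (j' - size (arr ! c))

main :: IO ()
main = do
  let s = foldr cons empty [0 .. 99 :: Int]
  print (map (index s) [0, 1, 50, 99])
-- prints: [0,1,50,99]
```

This is only a sketch of the shape of a solution, not a rebuttal of the performance question: whether GHC's allocator and GC keep up with the resulting churn of 8-wide arrays is exactly the part that would need measuring.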

-Jan



--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
