On May 13, 2009, at 6:58 PM, wren ng thornton wrote:
Jan-Willem Maessen wrote:
I wanted to clear up one misconception here...
wren ng thornton wrote:
> In heavily GCed languages like Haskell allocation and collection
is > cheap, so we don't mind too much; but in Java and the like,
both > allocation and collection are expensive so the idea of cheap
throwaway > objects is foreign.
Not true!
I was speaking of Java, not Clojure. I believe the costs in Java are
well documented, though I don't know enough about the JVM to know
where the blame belongs. (All I know of Clojure is that it's a Lisp-
like on the JVM :)
I think you're missing the point here: the code I refer to below *is
in Java* and is running on a standard JVM; the "costs" you refer to
simply don't exist! As Vladimir Ivanov points out, and as Rich Hickey
is happy to observe in his talks on Clojure, the JVM handles
allocation-intensive garbage-intensive programs very well.
If you look at the internals of Clojure, you'll discover they're
using trees with *very* wide fanout (eg fanout-64 leaf trees for
lists). Why? Because it's so cheap to allocate and GC these
structures! By using shallow-but-wide trees we reduce the cost of
indexing and accessing list elements. I suspect you'd still be
hard-pressed to support this kind of allocation behavior in any of
the present Haskell implementations, and Haskell implementations of
the same kinds of structures have limited fanout to 2-4 elements or
so.
I was under the impression that the reason datastructures in Haskell
tend to be limited to 4-fanout had more to do with the cleanliness
of the implementations--- pattern matching on 64-wide cells is quite
ugly, as is dealing with the proliferation of corner cases for
complex structures like finger trees, patricia trees, etc. The use
of view patterns could clean this up significantly. On the other
hand, we do have things like lazy ByteStrings and UVector which do
have wide fanouts.
Hmm, I think neither of the data structures you name actually support
both O(lg n) indexing and O(lg n) cons or append. That said, your
point is well taken, so let's instead state it as a challenge:
Can you, oh Haskellers, implement a fast, wide-fanout (say >= 8) tree-
based sequence implementation in Haskell, which supports at-least-log-
time indexing and at-least-log-time cons with a large base for the
logarithm? Can you do it without turning off array bounds checking
(either by using unsafe operations or low-level peeking and poking)
and without using an algebraic data type with O(f) constructors for
fanout of f? You can turn off bounds checks if your program encodes
static guarantees that indices cannot be out of bounds (there are a
couple of libraries to do this).
The spirit here is "Work in Haskell with safe operations and no FFI
except through safe libraries, but otherwise use any extensions you
like."
I actually think this *is* doable, but it touches a few areas where
Haskell doesn't presently do well (the bounds checking in particular
is a challenge). I threw in the bounds checking when I realized that
in fact the equivalent Java code is always bounds checked, and these
bounds checks are then optimized away where possible. Actually, I'd
*love* to see an *in*efficient solution to eliminating as many bounds
checks as possible!
-Jan
--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe