Hi all
Oleg, it seems to me that you're blurring two distinctions here.
(i) fixed-length versus unknown length (ii) cons-list versus array
There's no good reason why either the dependent-style vector
David Menendez wrote: ] data Vec n a where ] Nil :: Vec Zero a ] Cons :: a -> Vec n a -> Vec (Succ n) a
or the class-of-constructors representation (strangely familiar to me)
] data Vec_0 a = Nil ] data Vec_s v a = Cons a (v a)
S.M.Kahrs wrote: ] data Cons b a = a :. b a ] data NIL a = NIL
should eat any more memory or be harder to access than regular cons-lists. The length information is entirely static. This representation precisely does not suffer from the problem you quote with respect to nested types.
Nested types such as the AVL trees in
> http://www.haskell.org/pipermail/haskell/2003-April/011693.html
and the square matrix types suffer from the fact that their constructors must target all parameters uniformly, even if the recursive references are more specialised. You can only make the parameter bigger by going under a constructor, so if you want to make structures with size invariants, you get this prefix to build up the size which eats space and time. Here's a unary version of square matrices:
data Nil a = Nil data Cons b a = a :. b a
newtype Square a = Count Nil a data Count f a = Stop (f (f a)) | Step (Count (Cons f) a)
As you can see, it's the Stop and Step which take up the extra space, recording the size of the matrix, and also concealing it from the type. This structure represents square matrices of any size, ie
exists n. Vec (Vec n a)
with the Steps and Stop representing the n, and then the vector of vectors stored as sized cons-lists. If you want to write a multiplier for matrices, you need to expose these sizes, and that's where type classes help a bit and indexed datatype families help a lot.
For the same reason, it's easy to define the well-scoped lambda terms (Bird & Paterson 1999), because scope only gets bigger,
data Tm x = Var x | App (Tm x) (Tm x) | Lam (Tm (Maybe x))
but trickier to define the simply-typed lambda terms, where it helps to be able to say 'lambda makes (s -> t)' as you can with datatype families (Altenkirch & Reus, 1999) and now with GADTs. It's great to see this stuff making it into Haskell. There are plenty more examples in the literature that should shift over with minimal fuss.
Anyhow, when you say
to take the 1023d element of a 1024-element vector, we have to traverse 1023 Cons cells.
that's clearly to do with the cons-list representation, not the fixed length. Of course, I agree with you that the array representation of sequences is sometimes more appropriate than the cons-list representation, but the reverse is also true. It's good to have fixed-length versions of both.
The alternative is to encode the invariant -- the vector size, in our problem -- in *type* constructors, in phantom types. For example,
newtype PVector n a = PVector (Array Int a)
Vectors store their lengths implicitly, the way lists do; don't arrays store their bounds explicitly? But you're right, there's no reason to believe that the bound corresponds to the n...
So, we have to hide PVector in a module and expose only functions that build the values of PVector one way or the other. We also have to make sure that these functions are correct -- because they constitute the trusted kernel of our invariant.
Indeed we do, so we shall certainly have to work with sizes in a principled way inside the abstraction barrier, eg tabulation of Ashley's functions from finite sets (another traditional dependent vector representation), or whatever. Especially if we want to justify (at least to ourselves) the use of non-bound-checked accesses, which is surely part of the point here.
My point is just that the issue between inductive Vec and array PVector is just the cons-list versus array issue. It's not about how one works with sizes. In either approach, one inevitably comes up against the fact that sizes are numbers, ie a kind of data, and one would like them to behave accordingly. How hard the language makes it to treat these numbers like data is the measure of how well equipped it is to support this precise style of programming. The existence of crafty encodings cannot, in the long run, rescue an inappropriate design.
All the best
Conor
-- http://www.cs.nott.ac.uk/~ctm _______________________________________________ Haskell mailing list [email protected] http://www.haskell.org/mailman/listinfo/haskell
