I'd like a matrix library for D, able to handle sparse matrices too. But first I'd like to discuss its API and general design a bit.
---------------------------

I think a starting point for the design is visible in this paper, "Optimizing Array Accesses in High Productivity Languages" by Mackale Joyner et al.: http://www.cs.rice.edu/~vs3/PDF/hpcc07-176.pdf

The paper is about optimization, but what's interesting for me is that it also shows the API and usage of the matrices of the X10 language, which was designed for heavy numerical computations. Those X10 collections are based on "points" and "regions". A Point is a vector of N keys that index a collection (like (row, col) for a matrix, or the key of an associative array). If N is a compile-time constant we lose some flexibility but we gain efficiency; in theory we want both.

A Region defines part of a collection. In the simplest case it's a slice of a 1D array, or it can be a 2D rectangular region of a 2D array, and so on. In general it's a set of M points. Usually you want M to be a compile-time constant, but in some cases you want more flexibility. Sometimes you want to intersect regions, or you want non-convex regions, so the situation gets a little more complex.
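As a first rough sketch of those two ideas in D (all names and layout here are my own guesses, not the X10 API), a Point with compile-time rank N and a rectangular Region could look like this:

```d
/// An N-dimensional index; the rank N is a compile-time constant.
struct Point(size_t N) {
    ptrdiff_t[N] coords;
}

/// A rectangular region: inclusive per-dimension [low, high] bounds.
struct Region(size_t N) {
    ptrdiff_t[N] low, high;

    /// Number of points in the region (R.size() in the paper's notation).
    @property size_t size() const pure nothrow {
        size_t n = 1;
        foreach (i; 0 .. N)
            n *= cast(size_t)(high[i] - low[i] + 1);
        return n;
    }

    /// R.contains(P): is point p inside the region?
    bool contains(in Point!N p) const pure nothrow {
        foreach (i; 0 .. N)
            if (p.coords[i] < low[i] || high[i] < p.coords[i])
                return false;
        return true;
    }

    /// Region intersection; the result of intersecting two rectangular
    /// regions is again rectangular.
    Region opBinary(string op : "&")(in Region rhs) const pure nothrow {
        import std.algorithm : max, min;
        Region result;
        foreach (i; 0 .. N) {
            result.low[i]  = max(low[i],  rhs.low[i]);
            result.high[i] = min(high[i], rhs.high[i]);
        }
        return result;
    }
}

unittest {
    auto r1 = Region!2([0, 0], [3, 3]);
    auto r2 = Region!2([2, 1], [5, 2]);
    auto r3 = r1 & r2;                    // rows 2..3, columns 1..2
    assert(r3.size == 4);
    assert(r3.contains(Point!2([2, 2])));
    assert(!r3.contains(Point!2([4, 1])));
}
```

One detail: the paper writes intersection and union as R1 && R2 and R1 || R2, but D doesn't allow overloading && and ||, so a D library would have to use & and | (as above) or named methods.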
The paper lists some Region operations, which I think can be implemented with no changes to the D language:

R.rank           ::= # dimensions in region
R.size()         ::= # points in region
R.contains(P)    ::= predicate if region R contains point P
R.contains(S)    ::= predicate if region R contains region S
R.equal(S)       ::= true if regions R and S contain the same set of points
R.rank(i)        ::= projection of region R on dimension i (a one-dimensional region)
R.rank(i).low()  ::= lower bound of i-th dimension of region R
R.rank(i).high() ::= upper bound of i-th dimension of region R
R.ordinal(P)     ::= ordinal value of point P in region R
R.coord(N)       ::= point in region R with ordinal value = N
R1 && R2         ::= region intersection (will be rectangular if R1 and R2 are rectangular)
R1 || R2         ::= union of regions R1 and R2 (may or may not be rectangular, compact)
R1 - R2          ::= region difference (may or may not be rectangular, compact)

The paper also shows some ideas for array operations:

A.rank           ::= # dimensions in array
A.region         ::= index region (domain) of array A
A.distribution   ::= distribution of array A
A[P]             ::= element at point P, where P belongs to A.region
A | R            ::= restriction of array onto region R (returns copy of subarray)
A.sum(), A.max() ::= sum/max of elements in array
A1 <op> A2       ::= result of applying a point-wise op on array elements, when A1.region = A2.region (<op> can include +, -, *, and /)
A1 || A2         ::= disjoint union of arrays A1 and A2 (A1.region and A2.region must be disjoint)
A1.overlay(A2)   ::= array with region A1.region || A2.region, with element value A2[P] for all points P in A2.region and A1[P] otherwise

---------------------------

I've found a second source of ideas in this page: http://dis.um.es/~alberto/hmatrix/static.html It's a Haskell matrix library named "hmatrix". It's meant to be "safe" because it performs "automatic inference and compile-time checking of dimensions in matrix and vector operations".
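Going back to the paper's array operations for a moment: the point-wise ops (A1 <op> A2) and the restriction (A | R) seem straightforward to sketch in D for the dense 2-D case. Region2, Array2 and all the member names below are hypothetical, just to see how the API could feel:

```d
/// A rectangular 2-D region given by inclusive bounds (minimal sketch).
struct Region2 {
    ptrdiff_t rlow, rhigh, clow, chigh;
    @property size_t rows() const { return cast(size_t)(rhigh - rlow + 1); }
    @property size_t cols() const { return cast(size_t)(chigh - clow + 1); }
}

/// A dense array indexed by the points of a region.
struct Array2(T) {
    Region2 region;
    T[] data;  // row-major, region.rows * region.cols elements

    static Array2 over(Region2 r) {
        return Array2(r, new T[r.rows * r.cols]);
    }

    /// A[P], with P given in the region's own (global) coordinates.
    ref T opIndex(ptrdiff_t i, ptrdiff_t j) {
        return data[cast(size_t)(i - region.rlow) * region.cols
                    + cast(size_t)(j - region.clow)];
    }

    /// A | R: copy of the sub-array restricted to region r
    /// (assumes r is contained in this.region).
    Array2 opBinary(string op : "|")(Region2 r) {
        auto result = over(r);
        foreach (i; r.rlow .. r.rhigh + 1)
            foreach (j; r.clow .. r.chigh + 1)
                result[i, j] = this[i, j];
        return result;
    }

    /// A1 <op> A2: point-wise op, requires equal regions.
    Array2 opBinary(string op)(Array2 rhs)
    if (op == "+" || op == "-" || op == "*" || op == "/") {
        assert(region == rhs.region, "regions must match");
        auto result = over(region);
        foreach (k, x; data)
            mixin("result.data[k] = x " ~ op ~ " rhs.data[k];");
        return result;
    }
}

unittest {
    auto r = Region2(0, 1, 0, 2);          // 2 x 3 region
    auto a = Array2!double.over(r);
    auto b = Array2!double.over(r);
    a[0, 0] = 1.0;  b[0, 0] = 2.0;
    auto c = a + b;
    assert(c[0, 0] == 3.0);
    auto sub = a | Region2(0, 0, 0, 1);    // first row, first two columns
    assert(sub.data.length == 2);
}
```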
This means that in many cases arrays and matrices have a size known at compile time (so it's part of the type), and the library verifies statically that sizes match:

(matrix
  [ 2.0, 0.0, -1.0
  , 1.0, 1.0,  7.0
  , 5.0, 3.0,  1.0
  , 2.0, 8.0,  0.0 ] :: L 4 3)

But:
The size of the result of certain computations can only be known at run time. For instance, the dimension of the nullspace of a matrix depends on its rank, which is a nontrivial property of its elements.
The library solves this problem using a GHC compiler language extension:
By hiding the unknown dimension in an existential type we can still compute safely with a strongly typed nullspace.
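The static half of this is already expressible in D today: if the dimensions are template parameters, mismatched operations simply fail to compile, like the "L 4 3" type above. A minimal sketch (the Matrix name and layout are mine):

```d
/// A matrix with compile-time dimensions, analogous to hmatrix's "L 4 3".
struct Matrix(size_t R, size_t C) {
    double[R * C] data;  // row-major storage

    /// Matrix product: the inner dimensions must match *by type*,
    /// so a mismatch is a compile-time error, not a run-time one.
    Matrix!(R, C2) opBinary(string op : "*", size_t C2)(in Matrix!(C, C2) rhs) const {
        Matrix!(R, C2) result;
        result.data[] = 0.0;
        foreach (i; 0 .. R)
            foreach (j; 0 .. C2)
                foreach (k; 0 .. C)
                    result.data[i * C2 + j] += data[i * C + k] * rhs.data[k * C2 + j];
        return result;
    }
}

unittest {
    Matrix!(4, 3) a;
    Matrix!(3, 2) b;
    a.data[] = 1.0;
    b.data[] = 2.0;
    auto c = a * b;                   // fine: inner dimensions match (3 == 3)
    static assert(is(typeof(c) == Matrix!(4, 2)));
    assert(c.data[0] == 6.0);         // each entry sums three 1.0 * 2.0 products
    // Matrix!(2, 2) d; auto e = a * d;  // would not compile: 3 != 2
}
```

What D lacks out of the box is the second half, the existential trick for results whose size is only known at run time.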
Docs about Haskell existential types: http://www.haskell.org/haskellwiki/Existential_type

This is a normal Haskell record with specified types:

data Worker b x y = Worker {buffer :: b, input :: x, output :: y}

That is similar to this D code:

class Worker(B, X, Y) { B buffer; X input; Y output; }

Existential types can be used to "hide" a type on the left:

data Worker x y = forall b. Buffer b => Worker {buffer :: b, input :: x, output :: y}

This makes it possible to "hide" the size of an array whose length is not known at compile time. I am not a Haskell expert, but I think this is a kind of type erasure (where the erased type is the size of the array). With this strategy the compiler is able to enforce at compile time that the arrays have the right sizes in most cases, while allowing run-time sizes in the few remaining cases.

In D this idea should avoid the template bloat caused by instantiating code for every different array dimension, while still allowing optimizations based on compile-time knowledge of the sizes in the spots where the programmer asks for more efficiency (like unrolling on request in certain zones if the size is just 4).

Is all this possible with the type system of D? Perhaps it's possible with enum preconditions or some of the stuff discussed in the "Value range propagation for if-else" thread. Improving the D type system could be an option, if useful.

Bye,
bearophile
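To partially answer my own question: the closest D pattern I can think of erases the compile-time size behind an interface (or base class), so the length stops being part of the static type, much like the existential hides b. All the names here (AnyVector, Vector, DynVector) are hypothetical:

```d
/// Run-time view of a vector: the length is no longer part of the type.
interface AnyVector {
    size_t length() const;
    double opIndex(size_t i) const;
}

/// Fixed-size implementation: N is known at compile time, so code
/// written directly against Vector!N can be unrolled and optimized.
final class Vector(size_t N) : AnyVector {
    double[N] data;
    override size_t length() const { return N; }
    override double opIndex(size_t i) const { return data[i]; }
}

/// Run-time-sized fallback for the cases where the length is not
/// known statically (like the dimension of a nullspace).
final class DynVector : AnyVector {
    double[] data;
    this(size_t n) { data = new double[n]; }
    override size_t length() const { return data.length; }
    override double opIndex(size_t i) const { return data[i]; }
}

/// A computation whose result size is only known at run time can still
/// return a well-typed value, with the size "hidden" behind AnyVector.
AnyVector someRuntimeSizedResult(size_t n) {
    return new DynVector(n);
}

unittest {
    AnyVector v = someRuntimeSizedResult(3);
    assert(v.length == 3);
    AnyVector w = new Vector!4;   // a static size can be erased too
    assert(w.length == 4);
}
```

The difference from Haskell is that D pays for the erasure with virtual dispatch and heap allocation, while the existential version keeps the static guarantees and only hides the dimension from the caller; whether that cost is acceptable probably depends on where in the library the run-time sizes appear.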