Re: Designing a matrix library for D

2014-07-07 Thread Jared via Digitalmars-d


On Tuesday, 24 June 2014 at 03:29:52 UTC, H. S. Teoh via 
Digitalmars-d wrote:


Using lapack/blas as the backend would guarantee good 
performance,
though I'm not sure if the Phobos maintainers would be too 
thrilled at

the idea, after the fiasco with libcurl.



It is straightforward with version blocks to support multiple 
backends at compile time, even including a pure D version (lower 
performance but always works). One of our internal libraries 
supports either ATLAS/BLAS (for dev boxes) or MKL (for production 
big iron) in this way. We just change a makefile variable to use 
--version=ATLAS or --version=MKL.
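For illustration, such a version-block backend switch might look like the sketch below. The `cblas_ddot` declaration and the `ATLAS`/`MKL` version identifiers are assumptions for the example, not the poster's actual library code:

```d
// Backend selected at compile time: build with -version=ATLAS or
// -version=MKL (hypothetical identifiers); otherwise the pure D
// fallback is compiled in.
version (ATLAS)
    extern (C) double cblas_ddot(int n, const(double)* x, int incx,
                                 const(double)* y, int incy);
else version (MKL)
    // MKL exports the same CBLAS symbol; only the linked library differs.
    extern (C) double cblas_ddot(int n, const(double)* x, int incx,
                                 const(double)* y, int incy);

double dot(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length);
    version (ATLAS)
        return cblas_ddot(cast(int) a.length, a.ptr, 1, b.ptr, 1);
    else version (MKL)
        return cblas_ddot(cast(int) a.length, a.ptr, 1, b.ptr, 1);
    else
    {
        // Pure D fallback: lower performance, but always works.
        double s = 0;
        foreach (i; 0 .. a.length)
            s += a[i] * b[i];
        return s;
    }
}
```

Only the makefile variable changes between dev and production builds; the calling code is identical in all three configurations.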


J


Re: Designing a matrix library for D

2014-07-07 Thread w0rp via Digitalmars-d
I have implemented a couple of matrices which may be of interest. 
One is heap allocated with garbage collection, another is based 
on 2D static arrays. I didn't implement a sparse matrix.


https://github.com/w0rp/dstruct/blob/master/source/dstruct/matrix.d

There's very little which can be shared between sparse matrices 
and full matrices, as they optimise for very different scenarios. 
I wouldn't attempt to write too many functions that work on both.


The one very interesting thing I got out of working on the 
matrices is that you can get a reference to the 2D static array 
matrix as a 1D static array, which means that you can do the 
following.


Matrix!(float, M, N) matrix;

fillArrayInC(matrix.array1D.ptr, matrix.array1D.length);

That's the one interesting thing I'd take away from everything I 
did. The ability to do that in D might be useful for someone else 
some time.
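A minimal sketch of that trick (the `array1D` name follows the snippet above; the rest is my reconstruction, not the linked library's code):

```d
// A matrix stored as a 2D static array is one contiguous block of
// memory, so it can be reinterpreted as a 1D static array and its
// .ptr/.length handed straight to C code.
struct Matrix(T, size_t M, size_t N)
{
    T[N][M] data;

    // Reinterpret the 2D storage as a reference to a 1D static array.
    @property ref T[M * N] array1D()
    {
        return *cast(T[M * N]*) &data;
    }
}

// Hypothetical C routine that fills a flat buffer.
extern (C) void fillArrayInC(float* p, size_t n);

void example()
{
    Matrix!(float, 3, 4) matrix;
    fillArrayInC(matrix.array1D.ptr, matrix.array1D.length);
}
```

No copy is made: `array1D` is just a different static type over the same bytes.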


Re: Designing a matrix library for D

2014-06-24 Thread Mason McGill via Digitalmars-d

On Tuesday, 24 June 2014 at 04:36:04 UTC, ed wrote:

On Monday, 23 June 2014 at 21:08:03 UTC, Mason McGill wrote:
[snip]


Concepts:
 InputGrid: anything with a size (size_t[n]) and n-dimensional 
opIndex.
 OutputGrid: anything with a size (size_t[n]) and 
n-dimensional opIndexAssign.

[snip]


Cheers,
Mason


I don't think 'Grid' is a good name for the type. The term 
Grid is generally used to refer to any sort of grid: regular, 
semiregular, irregular, unstructured, curvilinear, and so on.


When you're talking about sampling on a grid, warping can either 
be thought of as warping the grid, or warping the function to be 
sampled. My library happens to work in terms of the second 
approach, so it is (pedantically) consistent. I see your point, 
though.


They are all very different in their use and internal 
representations. Often a grid will have support for metadata at 
each point (GIS etc.) whereas a matrix will not.


I think the word Matrix is a better fit, because that is what 
these types are.


Not quite. Matrix connotes 2-dimensional tensors (in the linear 
algebra sense). The library I'm developing at work is for 
manipulating multidimensional arrays, which don't have the same 
semantics people expect from matrices (representing linear 
operators). However, the name Array already refers to at least 
3 data structures in D, so it was out. I picked Grid because 
the library is for processing data sampled regularly on 
n-dimensional grids (mostly vision). With my library, you could 
represent complex data in a GIS via a grid of structures, or 
multiple grids (layers).


If you're not convinced, I intend to release my work under a 
liberal license, so feel free to download it and find/replace 
Grid -> Matrix :)


Also, check out this other option if you haven't already:
http://denis-sh.bitbucket.org/unstandard/unstd.multidimarray.html


Re: Designing a matrix library for D

2014-06-24 Thread bearophile via Digitalmars-d

David Bregman:

I didn't read the linked paper, but the performance quoted in 
the abstract, "comparable (from 48% to 100%) to that of 
lower-level Java programs", sounds pretty underwhelming.


The design I've suggested is not that virtual/indirect, so the 
performance is better.


Bye,
bearophile


Re: Designing a matrix library for D

2014-06-24 Thread bearophile via Digitalmars-d

David Bregman:

I think this approach would also be good for D, at least if the 
idea is to target numeric computing people.


I am thinking about something slightly different, with 
semantics more similar to that Haskell library's. I don't think D 
is going to compete with Julia and its high dynamism, so it's 
better to use static compilation to its fullest and catch more 
bugs at compile time. That's why I was discussing type system 
improvements, etc.


Bye,
bearophile


Re: Designing a matrix library for D

2014-06-24 Thread ed via Digitalmars-d

On Tuesday, 24 June 2014 at 07:01:14 UTC, Mason McGill wrote:

On Tuesday, 24 June 2014 at 04:36:04 UTC, ed wrote:

On Monday, 23 June 2014 at 21:08:03 UTC, Mason McGill wrote:
[snip]


Concepts:
InputGrid: anything with a size (size_t[n]) and n-dimensional 
opIndex.
OutputGrid: anything with a size (size_t[n]) and 
n-dimensional opIndexAssign.

[snip]


Cheers,
Mason


I don't think 'Grid' is a good name for the type. The term 
Grid is generally used to refer to any sort of grid: regular, 
semiregular, irregular, unstructured, curvilinear, and so on.


When you're talking about sampling on a grid, warping can 
either be thought of as warping the grid, or warping the 
function to be sampled. My library happens to work in terms of 
the second approach, so it is (pedantically) consistent. I see 
your point, though.


They are all very different in their use and internal 
representations. Often a grid will have support for metadata 
at each point (GIS etc.) whereas a matrix will not.


I think the word Matrix is a better fit, because that is what 
these types are.


Not quite. Matrix connotes 2-dimensional tensors (in the 
linear algebra sense). The library I'm developing at work is 
for manipulating multidimensional arrays, which don't have the 
same semantics people expect from matrices (representing linear 
operators). However, the name Array already refers to at 
least 3 data structures in D, so it was out. I picked Grid 
because the library is for processing data sampled regularly on 
n-dimensional grids (mostly vision). With my library, you could 
represent complex data in a GIS via a grid of structures, or 
multiple grids (layers).




Well, you could, but there are well-established data structures 
that are optimal for each grid type, so it would be better to 
just use one of those.


If you're not convinced, I intend to release my work under a 
liberal license, so feel free to download it and find/replace 
Grid -> Matrix :)


Fair enough and I am only nitpicking. Your library sounds 
promising so I'll definitely give it a whirl when released.




Also, check out this other option if you haven't already:
http://denis-sh.bitbucket.org/unstandard/unstd.multidimarray.html


Yep, this is nice and I use it already, along with Sci-D.


Cheers,
ed


Re: Designing a matrix library for D

2014-06-24 Thread Andrei Alexandrescu via Digitalmars-d

On 6/23/14, 8:46 PM, Tofu Ninja wrote:

On Tuesday, 24 June 2014 at 03:29:52 UTC, H. S. Teoh via Digitalmars-d
wrote:


a compile-time DSL instead of hack tricks like expression templates


That sounds like it will be very odd to use and will most likely 
have very little benefit. I think most people would agree that 
matrix math is a good use case for operator overloading.


Just a thought - a nice thing would be to have a shell backed up by 
multiple engines (LAPACK, BLAS, SIMD-based, naive etc), similarly to 
what LINQ does with virtualizing data access. Then possessors of various 
linear algebra libraries would be able to link with them and use them.


Andrei



Designing a matrix library for D

2014-06-23 Thread bearophile via Digitalmars-d

I'd like a matrix library for D, able to handle sparse matrices
too. But first I'd like to discuss its API and general design a
bit.

---

I think a starting point for the design is visible in this paper,
"Optimizing Array Accesses in High Productivity Languages" by
Mackale Joyner et al.:
http://www.cs.rice.edu/~vs3/PDF/hpcc07-176.pdf

The paper is about optimization, but what's interesting for me is
that it also shows the API and usage of the matrices of the X10
language, that was designed for heavy numerical computations.

Those collections of X10 are based on points and regions.

A Point is a vector of N keys that are indexes into a collection
(like (row, col) for a matrix, or the key for an associative
array). If N is a compile-time constant we lose some flexibility
but we gain efficiency. In theory we want both.

A Region defines part of a collection. In the simpler case it's a
slice of a 1D array, or it can be a 2D rectangular region of a 2D
array, or more. So it's an array of M points. Usually you want M
to be a compile-time constant, but in some cases you want more
flexibility. In some cases you want to intersect regions, or you
want non-convex regions, so the situation gets a little more
complex.

The paper lists some Region operations, that I think can be
implemented with no changes to the D language:

R.rank ::= # dimensions in region;
R.size() ::= # points in region
R.contains(P) ::= predicate if region R contains point P
R.contains(S) ::= predicate if region R contains region S
R.equal(S) ::= true if region R and S contain same set of points
R.rank(i) ::= projection of region R on dimension i (a
one-dimensional region)
R.rank(i).low() ::= lower bound of i-th dimension of region R
R.rank(i).high() ::= upper bound of i-th dimension of region R
R.ordinal(P) ::= ordinal value of point P in region R
R.coord(N) ::= point in region R with ordinal value = N
R1 && R2 ::= region intersection (will be rectangular if R1 and
R2 are rectangular)
R1 || R2 ::= union of regions R1 and R2 (may or may not be
rectangular/compact)
R1 - R2 ::= region difference (may or may not be
rectangular/compact)
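As a sanity check that these operations need no language changes, here is a rough D sketch of a rectangular Region (layout and names are my assumptions, not the paper's code; D cannot overload `&&`, so plain `&` stands in for the intersection operator):

```d
// A rectangular n-dimensional region, in the spirit of X10's regions.
struct Region(size_t rank_)
{
    size_t[rank_] low, high;   // inclusive per-dimension bounds

    enum rank = rank_;         // # dimensions, known at compile time

    // # points in the region.
    size_t size() const
    {
        size_t n = 1;
        foreach (i; 0 .. rank)
        {
            if (high[i] < low[i])
                return 0;      // empty region
            n *= high[i] - low[i] + 1;
        }
        return n;
    }

    // Predicate: does the region contain point p?
    bool contains(const size_t[rank_] p) const
    {
        foreach (i; 0 .. rank)
            if (p[i] < low[i] || p[i] > high[i])
                return false;
        return true;
    }

    // Intersection of two rectangular regions is always rectangular.
    Region opBinary(string op : "&")(const Region r) const
    {
        import std.algorithm : max, min;
        Region result;
        foreach (i; 0 .. rank)
        {
            result.low[i] = max(low[i], r.low[i]);
            result.high[i] = min(high[i], r.high[i]);
        }
        return result;
    }
}
```

Union and difference are harder precisely because their results need not be rectangular, which supports restricting the core design to rectangular regions.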


The paper also shows some ideas for array operations:

A.rank ::= # dimensions in array
A.region ::= index region (domain) of array
A.distribution ::= distribution of array A
A[P] ::= element at point P, where P belongs to A.region
A | R ::= restriction of array onto region R (returns copy of
subarray)
A.sum(), A.max() ::= sum/max of elements in array
A1 op A2 ::= returns result of applying a point-wise op on
array elements,
when A1.region = A2.region
(op can include +, -, *, and / )
A1 || A2 ::= disjoint union of arrays A1 and A2
(A1.region and A2.region must be disjoint)
A1.overlay(A2) ::= array with region, A1.region || A2.region,
   with element value A2[P] for all points P in A2.region
   and A1[P] otherwise.

---

I've found a second source of ideas in this page:
http://dis.um.es/~alberto/hmatrix/static.html

It's a Haskell matrix library named hmatrix. It's meant to be
safe because it performs automatic inference and compile-time
checking of dimensions in matrix and vector operations. This
means that in many cases arrays and matrices have a size known at
compile-time (so it's part of the type) and the library verifies
statically that they match:

(matrix
  [ 2.0, 0.0, -1.0
  , 1.0, 1.0,  7.0
  , 5.0, 3.0,  1.0
  , 2.0, 8.0,  0.0 ] :: L 4 3)
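The same compile-time dimension checking is natural in D when sizes are template parameters; a sketch mirroring hmatrix's `L 4 3` (all names invented):

```d
// L!(rows, cols): the dimensions are part of the type, so mismatched
// operations are rejected at compile time, as in hmatrix.
struct L(size_t rows, size_t cols)
{
    double[cols][rows] data;

    // Matrix product: the inner dimensions must agree in the types.
    L!(rows, n) opBinary(string op : "*", size_t n)(in L!(cols, n) rhs) const
    {
        L!(rows, n) r;
        foreach (i; 0 .. rows)
            foreach (j; 0 .. n)
            {
                double s = 0;
                foreach (k; 0 .. cols)
                    s += data[i][k] * rhs.data[k][j];
                r.data[i][j] = s;
            }
        return r;
    }
}

void example()
{
    L!(4, 3) a;
    L!(3, 2) b;
    auto c = a * b;     // OK: result has type L!(4, 2)
    // auto d = b * a;  // error: no overload, inner dimensions 2 != 4
}
```

The mismatched product fails overload resolution rather than producing a run-time error, which is exactly the guarantee hmatrix advertises.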


But:

The size of the result of certain computations can only be known 
at run time. For instance, the dimension of the nullspace of 
a matrix depends on its rank, which is a nontrivial property of 
its elements:


The library solves this problem using a GHC compiler language
extension:

By hiding the unknown dimension in an existential type we can 
still compute safely with a strongly typed nullspace.


Docs about Haskell existential types:

http://www.haskell.org/haskellwiki/Existential_type

This is a normal Haskell record with specified types:

data Worker b x y = Worker {buffer :: b, input :: x, output :: y}

That is similar to this D code:

class Worker(B, X, Y) {
 B buffer;
 X input;
 Y output;
}


Existential types can be used to hide a type on the left:

data Worker x y = forall b. Buffer b => Worker {buffer :: b,
input :: x, output :: y}

This makes it possible to hide the size of an array whose length
is not known at compile time. I am not a Haskell expert, but I
think this is a kind of type erasure (where the erased type is
the size of the array).

With this strategy in most cases the compiler is able to enforce
the arrays are of right types at compile-time, and allows
run-time sizes for the few remaining cases.

In D this idea should avoid template bloat caused by all the
different array dimensions, but allow optimizations based on the
compile-time knowledge of sizes in the points where the
programmer asks for more efficiency (like unrolling on request in
certain zones if the size is just 4).
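One way to approximate the existential-type trick in D, sketched with invented names: instantiate statically sized matrices where the sizes are known, and erase them behind an interface that exposes only run-time dimensions everywhere else.

```d
// The "erased" view: only run-time dimensions and element access.
interface AnyMatrix
{
    size_t rows() const;
    size_t cols() const;
    double opIndex(size_t i, size_t j) const;
}

// Statically sized instantiation; one template instance per (M, N).
class FixedMatrix(size_t M, size_t N) : AnyMatrix
{
    double[N][M] data;
    size_t rows() const { return M; }
    size_t cols() const { return N; }
    double opIndex(size_t i, size_t j) const { return data[i][j]; }
}

// Code that cannot know the size statically works through the
// interface, so it is compiled only once (limiting template bloat).
double trace(const AnyMatrix m)
{
    import std.algorithm : min;
    double s = 0;
    foreach (i; 0 .. min(m.rows, m.cols))
        s += m[i, i];
    return s;
}
```

Statically checked code paths keep the `FixedMatrix!(M, N)` type; only at the boundary where sizes become dynamic is the type "forgotten", much like the Haskell existential.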

Is all this possible 

Re: Designing a matrix library for D

2014-06-23 Thread H. S. Teoh via Digitalmars-d
On Mon, Jun 23, 2014 at 04:45:27PM +, bearophile via Digitalmars-d wrote:
 I'd like a matrix library for D, able to handle sparse matrices
 too. But first I'd like to discuss its API and general design a
 bit.

When it comes to libraries that need to handle diverse concrete types
(e.g. compact matrices vs. sparse matrices), my take is always to write
the matrix algorithms in a way that's independent of the concrete matrix
type(s), and to provide concrete types as a separate submodule (or
logically distinct section of the API).


[...]
 Those collections of X10 are based on points and regions.
 
 A Point is a vector of N keys, that are indexes of a vector (like
 the row,col for a matrix, or the key for an associative array).
 If N is a compile-time constant we lose some flexibility but we
 gain efficiency. In theory we want both.
 
 A Region defines part of a collection. In the simpler case it's a
 slice of a 1D array, or it can be a 2D rectangular region of a 2D
 array, or more. So it's an array of M points. Usually you want M
 to be a compile-time constant, but in some cases you want more
 flexibility. In some cases you want to intersect regions, or you
 want non-convex regions, so the situation gets a little more
 complex.

I think supporting non-convex (or non-rectangular) regions is a bit too
ambitious. Basic support should focus on rectangular/cuboidal/etc.
regions, which have the advantage that they are closed under
intersections (the intersection of two rectangular regions is always
another rectangular region). Non-rectangular regions would destroy a lot
of simplifying assumptions that can be made to streamline the design,
and yet most applications I can think of need only rectangular regions.
I don't think it's worth the price to support such unusual usages.

(Besides, if we design the API correctly, they should still be
supportable via wrapped matrix types -- this is why matrix algorithms
should not depend on concrete types, so that for special needs the user
can simply define a wrapper that remaps points in such a way as to
achieve the desired non-convex regions.)


 The paper lists some Region operations, that I think can be
 implemented with no changes to the D language:
 
 R.rank ::= # dimensions in region;
 R.size() ::= # points in region
 R.contains(P) ::= predicate if region R contains point P
 R.contains(S) ::= predicate if region R contains region S
 R.equal(S) ::= true if region R and S contain same set of points
 R.rank(i) ::= projection of region R on dimension i (a
 one-dimensional region)
 R.rank(i).low() ::= lower bound of i-th dimension of region R
 R.rank(i).high() ::= upper bound of i-th dimension of region R
 R.ordinal(P) ::= ordinal value of point P in region R
 R.coord(N) ::= point in region R with ordinal value = N
 R1 && R2 ::= region intersection (will be rectangular if R1 and
 R2 are rectangular)
 R1 || R2 ::= union of regions R1 and R2 (may or may not be
 rectangular/compact)
 R1 - R2 ::= region difference (may or may not be
 rectangular/compact)

My current thoughts on this, is that concrete matrix/array types should
only support array element access and size measurements. Operations like
intersections, subarrays, etc., should be done via wrapper types that
remap element access and size measurements, but forward element accesses
to the underlying concrete type. This way, we avoid excessive copying,
which gets very expensive with large or high-dimensional arrays.
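The wrapper idea might look like this for a two-dimensional subarray view (a sketch; `Grid` is any type indexable with two indices, and all names are assumptions):

```d
// A no-copy view: remaps indices into a rectangular sub-region and
// forwards element access to the underlying grid.
struct SubGrid(Grid)
{
    Grid* parent;
    size_t rowOff, colOff;
    size_t rows, cols;          // size of the view, not of the parent

    // Empty template parens make this a template, enabling `auto ref`
    // so that a ref-returning parent opIndex stays writable here.
    auto ref opIndex()(size_t i, size_t j)
    {
        assert(i < rows && j < cols, "index outside sub-region");
        return (*parent)[rowOff + i, colOff + j];  // forwarded access
    }
}

// Example with a trivial dense grid:
struct Dense
{
    double[8][8] data;
    ref double opIndex(size_t i, size_t j) { return data[i][j]; }
}

void example()
{
    Dense d;
    auto view = SubGrid!Dense(&d, 2, 2, 4, 4);  // 4x4 window at (2,2)
    view[0, 0] = 1.0;   // writes d.data[2][2]; nothing was copied
}
```

The view costs two offsets and two extents, regardless of how large the parent array is.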


 The paper also shows some ideas for array operations:
 
 A.rank ::= # dimensions in array
 A.region ::= index region (domain) of array
 A.distribution ::= distribution of array A

What's a distribution?


 A[P] ::= element at point P, where P belongs to A.region
 A | R ::= restriction of array onto region R (returns copy of
 subarray)

IMO, we should avoid all implicit copying. Let the user explicitly ask
for a copy if a copy is desired. Implicit copying is the mortal enemy of
high performance.


 A.sum(), A.max() ::= sum/max of elements in array
 A1 op A2 ::= returns result of applying a point-wise op on
 array elements,
 when A1.region = A2.region
 (op can include +, -, *, and / )
 A1 || A2 ::= disjoint union of arrays A1 and A2
 (A1.region and A2.region must be disjoint)
 A1.overlay(A2) ::= array with region, A1.region || A2.region,
with element value A2[P] for all points P in A2.region
and A1[P] otherwise.

All of these should be generic functions / wrappers, I think.


[...]
 I've found a second source of ideas in this page:
 http://dis.um.es/~alberto/hmatrix/static.html
[...]
 The size of the result of certain computations can only be known at
 run time. For instance, the dimension of the nullspace of a matrix
 depends on its rank, which is a nontrivial property of its elements:
 
 The library solves this problem using a GHC compiler language
 extension:
 
 By hiding the unknown dimension in an existential type we can still
 compute safely with a strongly typed nullspace.
[...]
 With this 

Re: Designing a matrix library for D

2014-06-23 Thread bearophile via Digitalmars-d

H. S. Teoh:


Operations like intersections, subarrays, etc., should be done
via wrapper types


A Region is not an array of data, it's more like a 
[(x1,x2),(y1,y2)]. You can allocate an array given a Region, you 
can iterate a Region, etc.




A.distribution ::= distribution of array A


What's a distribution?


It's a higher-level concept than a Region. It's less efficient 
than a Region, but more flexible. It decouples the Point from 
the location of the data in memory, so you can have a 
distribution that maps Points to various different concrete 
arrays.


http://x10.sourceforge.net/x10doc/2.2.1/x10/array/DistArray.html
http://x10.sourceforge.net/x10doc/2.2.1/x10/array/Dist.html

I have not fully understood what a Distribution is.

Bye,
bearophile


Re: Designing a matrix library for D

2014-06-23 Thread Wyatt via Digitalmars-d

On Monday, 23 June 2014 at 16:45:28 UTC, bearophile wrote:

I'd like a matrix library for D, able to handle sparse matrices
too. But first I'd like to discuss its API and general design a
bit.

This all reminds me, I was thinking I'd like to try adding some 
array language features to D at some point: things like APL's 
monadics and dyadics, and generic lifting of functions and 
operators as described here: 
http://www.ccs.neu.edu/home/pete/pub/esop-2014.pdf


It's been pretty low on my priority list, but it sounds similar 
to some of what you're laying out here.


-Wyatt


Re: Designing a matrix library for D

2014-06-23 Thread bearophile via Digitalmars-d

H. S. Teoh:

The key to success, then, lies in how we design the generic 
matrix API. If we do it right, then it should be easy to

implement features like compile-time size verifications,
etc.. Otherwise it can be very painful.


One point of my post was to see whether it could be useful to 
extend and improve the current D type system. Apparently you are 
saying this is not needed for efficiency reasons (but perhaps 
some type system improvements could be useful to reduce the 
template bloat from the usage of this matrix library. The idea 
is to perform the compile-time tests of the sizes, and then erase 
such types in most cases, avoiding the template bloat. There are 
manual ways to do this, but here it's better to receive help from 
the type system).


Bye,
bearophile


Re: Designing a matrix library for D

2014-06-23 Thread Mason McGill via Digitalmars-d

On Monday, 23 June 2014 at 16:45:28 UTC, bearophile wrote:
I'm glad to hear there's interest in this! I've actually been 
working on a generic multidimensional array library myself (on 
and off) for the past few months (I'm hoping to have a 
well-documented dub package by the end of the summer). It's a bit 
of a hybrid between Julia's `AbstractArray`, NumPy's `ndarray`, 
Numba, and std.range, and I think it has a very D feel.


It is, however, designed for processing sampled signals (rather 
than matrix algebra), which may be more or less in line with 
different people's interests.


Concepts:
  InputGrid: anything with a size (size_t[n]) and n-dimensional 
opIndex.
  OutputGrid: anything with a size (size_t[n]) and n-dimensional 
opIndexAssign.


Functionality:
  - Lazy `map` and `reduce`.
  - Lazy `zip` with built-in broadcasting.
  - Lazy n-dimensional convolution and cross-correlation.
  - Lazy sampling and interpolation.
  - `collect` to force evaluation.
  - Element-wise operators and fancy indexing via mixins*.
  - Everyone's favorite MATLAB-style constructors
(`zeros`, `identity`, `meshGrid`, etc.)
  - Audio/Video and JSON I/O, via GStreamer (though this
probably ought to be a separate package).

*Because D doesn't allow UFCS for operators, which is silly, but 
that's the way it is.


Other features:
  - Compatibility with an automatic binding
generator I've been working on.
  - Testing, because I've been using this at work :)

If people want to use this, it might make sense to add linear 
algebra to this package rather than building a separate, 
incompatible algebra library (see this thread: 
http://comments.gmane.org/gmane.comp.lang.julia.devel/14533).


I did a lot of thinking about the level of abstraction that would 
be the most useful, and I settled on something like Julia's 
`AbstractArray` because it fits into D nicely (like ranges, but 
different!) and easily maps onto something that can be passed to 
other languages (a block of memory with size/strides metadata). 
As in Julia, I decided to make rank (`nDim!grid`) a compile-time 
entity since


1) Writing rank-branching code for NumPy is cumbersome and error 
prone; rank should be part of overload resolution.
2) It speeds up component-programming-style operations without 
requiring leaky abstractions (having to specialize on concrete 
grid types).


It's a busy week so I probably won't be able to follow this 
thread, but I wanted to let you know I was working on this to 
avoid duplicated efforts, etc.


Cheers,
Mason


Re: Designing a matrix library for D

2014-06-23 Thread H. S. Teoh via Digitalmars-d
On Mon, Jun 23, 2014 at 09:08:02PM +, Mason McGill via Digitalmars-d wrote:
 On Monday, 23 June 2014 at 16:45:28 UTC, bearophile wrote:
 I'm glad to hear there's interest in this! I've actually been working
 on a generic multidimensional array library myself (on and off) for
 the past few months (I'm hoping to have a well-documented dub package
 by the end of the summer). It's a bit of a hybrid between Julia's
 `AbstractArray`, NumPy's `ndarray`, Numba, and std.range, and I think
 it has a very D feel.

You are far from the first one to write a multidimensional array library
in D. Denis has already done this some time ago, and I have, too. It's
clear that we need to get our heads together and knock up something
Phobos-worthy.


 It is, however, designed for processing sampled signals (rather than
 matrix algebra), which may be more or less in line with different
 people's interests.

I've thought about this a lot, and my conclusion is that matrix algebra
is best left to a matrix-specific library wrapper that endows matrix
algebra properties on top of a generic multidimensional array type. The
main reason is that matrix algebra is rather peculiar to matrices
specifically, particularly the non-commutative matrix product, and does
not have very many useful usages outside of matrix algebra itself; yet
multidimensional arrays in general are useful in a wider scope.

So in my mind, there are several components that are needed here:
- A generic multidimensional array, as a data container. This can be a
  set of diverse concrete types, e.g. to represent sparse matrices,
  etc.. No domain-specific operations are specified for these
  containers; they are purely for storage and access only.

- Specialized array types, like matrices which add matrix algebra
  semantics to the underlying multidimensional array storage container
  (e.g., matrix product), or tensor wrappers (endows tensor product
  semantics on the underlying container).

- Algorithms: these take any multidimensional array, or specialized
  matrix type (depending on the purpose of each algorithm), and operate
  on them using a common multidimensional array API. They should be as
  generic as they can be, but no more, e.g., LU decomposition algorithms
  would take a matrix type rather than the general multidimensional
  array type, because it is specific to matrices, whereas per-element
  summation would take any general multidimensional array type, since it
  doesn't need matrix-specific properties. Similarly, subarray
  operations should be able to work with any multidimensional array
  type, since it simply provides a restricted view on the underlying
  storage container, but doesn't depend on the implementational
  specifics of the container itself.
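A sketch of that split (the concept test, member names, and the no-pivoting LU are all placeholders, not an existing API):

```d
// Concept test: any 2-D indexable type with rows/cols counts as a grid.
enum isGrid2D(T) = is(typeof({
    T t = T.init;
    size_t r = t.rows, c = t.cols;
    auto e = t[0, 0];
}));

// Generic algorithm: needs only element access, so any grid works.
double elementSum(G)(G g) if (isGrid2D!G)
{
    double s = 0;
    foreach (i; 0 .. g.rows)
        foreach (j; 0 .. g.cols)
            s += g[i, j];
    return s;
}

// Matrix-specific algorithm: in-place LU (Doolittle, no pivoting).
// Assumes linear-algebra semantics, square shape, and elements that
// are assignable through indexing.
void luDecompose(M)(ref M m) if (isGrid2D!M)
{
    assert(m.rows == m.cols, "LU decomposition needs a square matrix");
    foreach (k; 0 .. m.rows)
        foreach (i; k + 1 .. m.rows)
        {
            m[i, k] = m[i, k] / m[k, k];
            foreach (j; k + 1 .. m.rows)
                m[i, j] = m[i, j] - m[i, k] * m[k, j];
        }
}
```

`elementSum` accepts sparse matrices, views, and wrappers alike; `luDecompose` would in a real library tighten its constraint to a matrix concept rather than the bare grid one used here.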


[...]
 I did a lot of thinking about the level of abstraction that would be
 the most useful, and I settled on something like Julia's
 `AbstractArray` because it fits into D nicely (like ranges, but
 different!) and easily maps onto something that can be passed to
 other languages (a block of memory with size/strides metadata). As in
 Julia, I decided to make rank (`nDim!grid`) a compile-time entity
 since
 
 1) Writing rank-branching code for NumPy is cumbersome and error prone; rank
 should be part of overload resolution.
 2) It speeds up component-programming-style operations without requiring
 leaky abstractions (having to specialize on concrete grid types).
 
 It's a busy week so I probably won't be able to follow this thread, but I
 wanted to let you know I was working on this to avoid duplicated efforts,
 etc.
[...]

I think the best solution would be a generic multidimensional concept
(analogous to the input range concept) that will fit all of our
multidimensional array implementations. Algorithms that can work with
diverse ranks shouldn't care if the rank of a particular concrete type
is compile-time or runtime, for example. Let the algorithms be
adaptable (up to practicality, of course -- if an algo can't work or
works very poorly with dynamic rank, then just add a sig constraint that
requires static rank, for example), and the user can choose which
concrete type(s) best suits his needs.

The concrete types can then be provided by a separate module, and if we
do it right, should be interoperable with each other.


T

-- 
How are you doing? Doing what?


Re: Designing a matrix library for D

2014-06-23 Thread H. S. Teoh via Digitalmars-d
On Mon, Jun 23, 2014 at 09:09:03PM +, bearophile via Digitalmars-d wrote:
 H. S. Teoh:
 
 The key to success, then, lies in how we design the generic matrix
 API. If we do it right, then it should be easy to implement features
 like compile-time size verifications, etc.. Otherwise it can be very
 painful.
 
 One point of my post was to see if it could be useful to extend and
 improve the current D type system. Apparently you are saying this is
 not needed for efficiency reasons (but perhaps some type system
 improvements could be useful to reduce template bloat with the usage
 of this matrix library. The idea is to perform the compile-time tests
 of the sizes, and then remove such types in most cases, avoiding the
 template bloat. There are manual ways to do this, but here it's better
 to receive help from the type system).
[...]

I'm not against using / extending the type system to improve performance
or reduce template bloat.  But it would be far more ambitious than to
implement something that works with the current language.


T

-- 
Laissez-faire is a French term commonly interpreted by Conservatives to mean 
'lazy fairy,' which is the belief that if governments are lazy enough, the Good 
Fairy will come down from heaven and do all their work for them.


Re: Designing a matrix library for D

2014-06-23 Thread Mason McGill via Digitalmars-d
On Monday, 23 June 2014 at 21:33:40 UTC, H. S. Teoh via 
Digitalmars-d wrote:
You are far from the first one to write a multidimensional 
array library
in D. Denis has already done this some time ago, and I have, 
too.


I've seen Denis' library. It seems nice, but it's a lot more 
concrete than what the original post was discussing. Is yours 
online somewhere (for inspiration)?


I've thought about this a lot, and my conclusion is that matrix 
algebra
is best left to a matrix-specific library wrapper that endows 
matrix
algebra properties on top of a generic multidimensional array 
type. The
main reason is that matrix algebra is rather peculiar to 
matrices
specifically, particularly the non-commutative matrix product, 
and does
not have very many useful usages outside of matrix algebra 
itself; yet

multidimensional arrays in general are useful in a wider scope.

So in my mind, there are several components that are needed 
here:
- A generic multidimensional array, as a data container. This 
can be a
  set of diverse concrete types, e.g. to represent sparse 
matrices,

  etc.. No domain-specific operations are specified for these
  containers; they are purely for storage and access only.

- Specialized array types, like matrices which add matrix 
algebra
  semantics to the underlying multidimensional array storage 
container
  (e.g., matrix product), or tensor wrappers (endows tensor 
product

  semantics on the underlying container).

- Algorithms: these take any multidimensional array, or 
specialized
  matrix type (depending on the purpose of each algorithm), and 
operate
  on them using a common multidimensional array API. They 
should be as
  generic as they can be, but no more, e.g., LU decomposition 
algorithms
  would take a matrix type rather than the general 
multidimensional
  array type, because it is specific to matrices, whereas 
per-element
  summation would take any general multidimensional array type, 
since it

  doesn't need matrix-specific properties. Similarly, subarray
  operations should be able to work with any multidimensional 
array
  type, since it simply provides a restricted view on the 
underlying

  storage container, but doesn't depend on the implementational
  specifics of the container itself.


Seems reasonable.

I think the best solution would be a generic multidimensional 
concept

(analogous to the input range concept) that will fit all of our
multidimensional array implementations. Algorithms that can 
work with
diverse ranks shouldn't care if the rank of a particular 
concrete type

is compile-time or runtime, for example. Let the algorithms be
adaptable (up to practicality, of course -- if an algo can't 
work or

works very poorly with dynamic rank, then just add a signature
constraint that
requires static rank, for example), and the user can choose 
which

concrete type(s) best suits his needs.


One possible issue with dynamically-ranked grids is 
concept-testing. `isInputGrid!Collection` needs to check if 
`Collection` has a `size` property, and can be indexed with 
`size.length` indices. If this is to be done at compile-time, 
`size.length` needs to be constant.
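The compile-time test being described might be written like this (a guess at the shape, not Mason's actual code): the `typeof`-of-a-lambda idiom only accepts `T` if `size` is a static array, so that its `length` (the rank) is a compile-time constant.

```d
// isInputGrid: size must have a compile-time length n, and opIndex
// must accept n indices.
enum isInputGrid(T) = is(typeof({
    T g = T.init;
    enum n = g.size.length;   // must be a compile-time constant (rank)
    import std.meta : Repeat;
    auto e = g.opIndex(Repeat!(n, size_t.init));  // n-dimensional index
}));

// A conforming statically ranked 2-D grid:
struct Dense2
{
    float[4][4] data;
    @property size_t[2] size() const { return [4, 4]; }
    float opIndex(size_t i, size_t j) const { return data[i][j]; }
}

// A dynamically ranked grid fails: a size_t[] has no static length.
struct DynGrid
{
    size_t[] size;
    float opIndex(size_t[] idx...) const { return 0; }
}

static assert(isInputGrid!Dense2);
static assert(!isInputGrid!DynGrid);
```

This illustrates the tension in the thread: a concept checked this way bakes static rank into the design, so dynamically ranked containers need a separate (run-time) protocol.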


The concrete types can then be provided by a separate module, 
and if we

do it right, should be interoperable with each other.


Definitely an important design goal. You seem quite knowledgeable 
on this topic and I hope we can continue this conversation when 
my work on this is further along.


Re: Designing a matrix library for D

2014-06-23 Thread David Bregman via Digitalmars-d
Have you looked at armadillo for inspiration? 
http://arma.sourceforge.net/


It's a C++ matrix/linear algebra library which has:
- expression templates for high level optimizations (could be
  done even better in D of course, moving more stuff to compile
  time)
- high performance via lapack/blas
- deliberately matlab-like syntax which is appealing to a large
  subset of people who would use such a library

I think this approach would also be good for D, at least if the
idea is to target numeric computing people. I am skeptical about
making the library too generic, since performance is typically a
primary concern for matrix code. I didn't read the linked paper,
but the performance quoted in the abstract, comparable (from 48%
to 100%) to that of lower-level Java programs, sounds pretty
underwhelming.


Re: Designing a matrix library for D

2014-06-23 Thread H. S. Teoh via Digitalmars-d
On Tue, Jun 24, 2014 at 03:04:53AM +, David Bregman via Digitalmars-d wrote:
 Have you looked at armadillo for inspiration? http://arma.sourceforge.net/
 
 It's a C++ matrix/linear algebra library which has:
 -expression templates for high level optimizations (could be done even
 better in D of course, moving more stuff to compile time)
 -high performance via lapack/blas
 -deliberately matlab-like syntax which is appealing to a large subset
 of people who would use such a library
 
 I think this approach would also be good for D, at least if the idea
 is to target numeric computing people. I am skeptical about making the
 library too generic, since performance is typically a primary concern
 for matrix code. I didn't read the linked paper, but the performance
 quoted in the abstract comparable (from 48% to 100%) to that of
 lower-level Java programs sounds pretty underwhelming.

Using lapack/blas as the backend would guarantee good performance,
though I'm not sure if the Phobos maintainers would be too thrilled at
the idea, after the fiasco with libcurl.

For high performance, I would implement matrix algebra completely within
a compile-time DSL instead of hack tricks like expression templates and
operator overloading. That way, the implementation can algebraically
manipulate the expressions to maximize opportunities for optimizations.

As for genericity, I think D has the ability to generate performant code
*and* be generic at the same time. At least, it'd be worth shooting for,
to see how far we can get. I believe the key is to separate the
algorithms from the concrete types, and use specialization for the
performance-sensitive cases. This way, the generic cases will work, and
the specific cases will be high-performance.
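The generic-plus-specialized split can be sketched like this. All names are made up for illustration; the BLAS call is only indicated in a comment, with a plain loop standing in so the sketch stays self-contained.

```d
import std.stdio;

// Hypothetical dense matrix with contiguous row-major storage.
struct Matrix(T)
{
    T[] storage;
    size_t rows, cols;
    T opIndex(size_t i, size_t j) const { return storage[i * cols + j]; }
}

// Generic path: anything with rows, cols and opIndex(i, j) works.
T[] matVec(T, M)(M a, T[] x) if (!is(M == Matrix!double))
{
    auto y = new T[](a.rows);
    y[] = 0;  // T.init is NaN for floating point, so zero explicitly
    foreach (i; 0 .. a.rows)
        foreach (j; 0 .. a.cols)
            y[i] += a[i, j] * x[j];
    return y;
}

// Specialized path for the performance-sensitive dense-double case.
// A real library would hand the storage straight to BLAS, e.g.:
//   cblas_dgemv(CblasRowMajor, CblasNoTrans, a.rows, a.cols,
//               1.0, a.storage.ptr, a.cols, x.ptr, 1, 0.0, y.ptr, 1);
double[] matVec(M)(M a, double[] x) if (is(M == Matrix!double))
{
    auto y = new double[](a.rows);
    y[] = 0;
    foreach (i; 0 .. a.rows)      // stand-in for the gemv call
        foreach (j; 0 .. a.cols)
            y[i] += a.storage[i * a.cols + j] * x[j];
    return y;
}

void main()
{
    auto a = Matrix!double([1.0, 2.0, 3.0, 4.0], 2, 2);
    writeln(matVec(a, [1.0, 1.0]));    // [3, 7] via the specialized path
    auto b = Matrix!float([1.0f, 2.0f, 3.0f, 4.0f], 2, 2);
    writeln(matVec(b, [1.0f, 1.0f]));  // [3, 7] via the generic path
}
```

The mutually exclusive template constraints make dispatch unambiguous: callers write one call, and the library picks the fast path where it exists.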


T

-- 
Nothing in the world is more distasteful to a man than to take the path that 
leads to himself. -- Herman Hesse


Re: Designing a matrix library for D

2014-06-23 Thread Tofu Ninja via Digitalmars-d
On Tuesday, 24 June 2014 at 03:29:52 UTC, H. S. Teoh via 
Digitalmars-d wrote:



a compile-time DSL instead of hack tricks like expression

That sounds like it will be very odd to use and will most likely
have very little benefit. I think most people would agree that
matrix math is a good use case for operator overloading.


Re: Designing a matrix library for D

2014-06-23 Thread ed via Digitalmars-d

On Monday, 23 June 2014 at 21:08:03 UTC, Mason McGill wrote:
[snip]


Concepts:
  InputGrid: anything with a size (size_t[n]) and n-dimensional opIndex.
  OutputGrid: anything with a size (size_t[n]) and n-dimensional opIndexAssign.

[snip]


Cheers,
Mason


I don't think 'Grid' is a good name for the type. The term Grid
is generally used to refer to any sort of grid: regular,
semiregular, irregular, unstructured, curvilinear, etc. These are
all very different in their use and internal representations.
Often a grid will have support for metadata at each point (GIS
etc.), whereas a matrix will not.


I think the word Matrix is a better fit, because that is what
these types are.


Re: Designing a matrix library for D

2014-06-23 Thread H. S. Teoh via Digitalmars-d
On Tue, Jun 24, 2014 at 03:46:09AM +, Tofu Ninja via Digitalmars-d wrote:
 On Tuesday, 24 June 2014 at 03:29:52 UTC, H. S. Teoh via Digitalmars-d
 wrote:
 
 a compile-time DSL instead of hack tricks like expression
 
 That sounds like it will be very odd to use and most likely have
 very little benefit. I think most people would agree that matrix math
 is a good use case for operator overloading.

The DSL can look exactly like a regular matrix expression.

The case against operator overloading is that unless you resort to
hacks like expression templates, the result will likely have poor
performance, because the overloaded operator can only see one or two
arguments at a time, whereas a compile-time DSL interpreter can analyze
the entire expression and use algebraic rules to rewrite it in a form
that can be efficiently implemented, e.g., in terms of lapack/blas
primitives for computing expressions like A*x + y. Doing this sort of
rewriting with expression templates would be a maintenance nightmare.
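As a toy illustration of the idea (not a real design): a compile-time "interpreter" can see the whole expression at once and lower it to a single fused loop, the way a gemv call would, instead of building temporaries pairwise. This sketch cheats heavily: it only recognizes the one literal pattern "A*x + y", where a real implementation would parse arbitrary expressions and choose rewrites algebraically.

```d
import std.stdio;

// Hypothetical entry point: the expression is a compile-time string,
// so the library sees "A*x + y" as a whole and can emit one fused
// gemv-style pass with no temporary for A*x.
double[] evaluate(string expr)(double[][] A, double[] x, double[] y)
    if (expr == "A*x + y")  // the only rewrite this toy knows
{
    auto r = y.dup;
    foreach (i, row; A)
        foreach (j, a; row)
            r[i] += a * x[j];
    return r;
}

void main()
{
    double[][] A = [[1.0, 0.0], [0.0, 2.0]];
    auto r = evaluate!"A*x + y"(A, [3.0, 4.0], [1.0, 1.0]);
    writeln(r);  // [4, 9]
}
```

With operator overloading alone, `A*x` would have to materialize before `+ y` runs; here the whole expression is a single template instantiation, so the rewrite happens before any code is generated.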


T

-- 
Lawyer: (n.) An innocence-vending machine, the effectiveness of which depends 
on how much money is inserted.