Re: Arrays vs Lists and Specialization

2003-01-22 Thread Ketil Z. Malde
Matthew Donadio [EMAIL PROTECTED] writes:

 OK, my question then has to do with the efficiency of lists versus
 arrays.  Do the latest compilers handle handle arrays efficiently, or
 are lists really the way to go?  

I've currently struggled a bit with arrays.  I have a list based
program (calculating suffix arrays, since you ask), and since I
experience a notably lower performance than array based C equivalents,
I thought using arrays would help me out.

Currently, I've been able to use arrays efficiently as read-only data
structures. I've tried to use STArrays to do updates imperatively, but
it's still slow, and uses a lot of memory (that doesn't show up in the
heap profiling).  I'll try to wrap more of the program in the ST
monad, to see if it helps.

 If there is a performace difference, is it generally big enough to
 warrant rewriting algorithms?

I think it is hard to answer that generally.  For some algorithms, the
benefit can be significant; it depends on your application, your data
set, and your resources. 

But remember that correct is better than fast, and readable is better
than correct. :-)

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



RE: Arrays vs Lists and Specialization

2003-01-22 Thread Simon Peyton-Jones
Matthew

| Many spectral estimation routines are defined in terms of special
| matrices (ie, Toeplitz, etc).  Arrays defined recursively by list
| comprehensions make it easy to implement algorithms like
Levinson-Durbin
| recursion, and they look very similar to the mathematical definitions:
| 
|  levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart
(rho!p))
|  where a   = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p],
i - [1..k] ]
|rho = array (1,p) [ (k, rhok k) | k - [1..p] ]
|ak 1 1 = -r!1 / r!0
|ak k i | k==i  = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l -
[1..(k-1)] ]) /
| rho!(k-1)
|   | otherwise = a!(k-1,i) + a!(k,k) * (conjugate
(a!(k-1,k-i)))
|rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0
|rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1)
| 
| OK, my question then has to do with the efficiency of lists versus
| arrays.  Do the latest compilers handle handle arrays efficiently, or
| are lists really the way to go?  If there is a performace difference,
is
| it generally big enough to warrant rewriting algorithms?

Do not rewrite it!  This is a textbook application of Haskell arrays,
and they should work beautifully.  If not, we should fix it.  Using
lists will probably be much *less* efficient than arrays.

People often use arrays with algorithms derived from imperative
programming, which assume update-in-place.   They use (\\) for each
element, which copies the entire array, and get terrible performance.

Haskell arrays are designed to be build en-bloc, as you are doing it.
Your programs are lovely, and will not do redundant array copying.

It is, however, possible that GHC fails to deforest away all the
intermediate lists in the array comprehensions, but I think it does a
reasonable job.  You can use -ddump-simpl to see.


| A related question is how is specilization handled in arrays with lazy
| evaluation.  In the definition of levinson above, the a array is
defined
| in terms of the ak function.  By doing this, you save some horizontal
| space, but it also unburdens the programmer from tracking the
recursive
| dependencies.  a!(k,k) is needed before a!(i,j) can be calculated, but
| lazy evaluation takes care of this.

Yes.  The efficiency penalty is that every element is built as a thunk
and only later evaluated on demand.

| If the above function is
| specialized for r::Array Int (Complex Double) and p::Int, would I be
| correct to say that the final value of the function would be unboxed,
| but all intermediate values wouldn't? 

No, the function returns an array, which is always boxed.  Its
*elements* may be unboxed, but only if you, the programmer, specify them
to be unboxed.  GHC will not unbox array elements automatically, because
that makes them strict, and that changes the semantics.

To use unboxed arrays, you need the UArray library (documented in GHC's
libraries).  If the algorithm does not use laziness, you can often unbox
array elements just by changing Array to UArray in the types.  Easy.
But that makes it strict, and that may make your algorithm fail.  No
magic here.

| Now, in some cases, a user may
| need all of the model orders from 1..p.  This is handled easilly
enough
| by just changing the first line. to
| 
|  levinson r p = (a, fmap realPart rho)
| 
| Would the a matrix in the tuple be unboxed with specilization?

Not sure what you mean here, but I don't think so.

| If anyone is interesting in what I have put together, I will be making
| everything public sometime next week.  I have a lot of algorithms
| implemented, but I need to clean up the documentation a bit (well, a
| lot).

Would you like to write it up as a paper for the Journal of Functional
Programming? 

By the way, do you know of David Goblirsch's work
http://delivery.acm.org/10.1145/33/326801/p425-goblirsch.pdf?key1=32
6801key2=1086223401coll=GUIDEdl=GUIDECFID=7058054CFTOKEN=55810940

I'm not sure where he is now.  

Simon
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Arrays vs Lists and Specialization

2003-01-22 Thread Bjorn Lisper
Matthew Donadio:
| Many spectral estimation routines are defined in terms of special
| matrices (ie, Toeplitz, etc).  Arrays defined recursively by list
| comprehensions make it easy to implement algorithms like
Levinson-Durbin
| recursion, and they look very similar to the mathematical definitions:
| 
|  levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart
(rho!p))
|  where a   = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p],
i - [1..k] ]
|   rho = array (1,p) [ (k, rhok k) | k - [1..p] ]
|   ak 1 1 = -r!1 / r!0
|   ak k i | k==i  = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l -
[1..(k-1)] ]) /
| rho!(k-1)
|  | otherwise = a!(k-1,i) + a!(k,k) * (conjugate
(a!(k-1,k-i)))
|   rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0
|   rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1)
| 
| OK, my question then has to do with the efficiency of lists versus
| arrays.  Do the latest compilers handle handle arrays efficiently, or
| are lists really the way to go?  If there is a performace difference,
is
| it generally big enough to warrant rewriting algorithms?

Simon:

Do not rewrite it!  This is a textbook application of Haskell arrays,
and they should work beautifully.  If not, we should fix it.  Using
lists will probably be much *less* efficient than arrays.

People often use arrays with algorithms derived from imperative
programming, which assume update-in-place.   They use (\\) for each
element, which copies the entire array, and get terrible performance.

Haskell arrays are designed to be build en-bloc, as you are doing it.
Your programs are lovely, and will not do redundant array copying.

It is, however, possible that GHC fails to deforest away all the
intermediate lists in the array comprehensions, but I think it does a
reasonable job.  You can use -ddump-simpl to see.

You could check out SAC. SAC is a functional language specialized for
arrays, with restrictions in the language to ensure efficient compilation.
The SAC compiler can apparently do a very good job on scheduling/memory
allocation as to optimize for cache performance: this is an important issue
to get performance in array computations that I believe current Haskell
implementations do not bother much about.

SAC has syntactic forms to define arrays that are somewhat reminiscent of
array comprehensions, although they are much more restricted. An interesting
route for a *real* array-optimizing Haskell compiler could be to try to find
instances of array comprehensions that can be mapped to constructs
corresponding to SAC's constructs, and then apply the compilation technology
of SAC. An alternative is to make such constructs directly available in
Haskell, so the programmer can be explicit whan performance is needed.

Homepage http://www.informatik.uni-kiel.de/~sacbase/.

Björn Lisper
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Re: Arrays vs Lists and Specialization

2003-01-22 Thread Hal Daume III
 SAC has syntactic forms to define arrays that are somewhat reminiscent of
 array comprehensions, although they are much more restricted. An interesting
 route for a *real* array-optimizing Haskell compiler could be to try to find
 instances of array comprehensions that can be mapped to constructs
 corresponding to SAC's constructs, and then apply the compilation technology
 of SAC. An alternative is to make such constructs directly available in
 Haskell, so the programmer can be explicit whan performance is needed.

Also related is Manuel Chakravarty et al's array fusion
(http://www.cse.unsw.edu.au/~chak/papers/CK01.html) which hopefully will
introduce array comprehensions into Haskell shortly (see
http://www.cse.unsw.edu.au/~chak/afp02/).


___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell



Arrays vs Lists and Specialization

2003-01-21 Thread Matthew Donadio
Hi all,

I'm sorry if this topic has been rehashed a lot, but I poked around in
the mailing list archive, and didn't find what I was looking for.

I currently have some free time on my hands, and have been implementing
some digital signal processing and spectral/frequency estimation
algorithms along with the needed matrix routines in Haskell.  

For those unfamiliar with this field, most algorthms are defined in
textbooks in terms of indexing through discrete sequences.  For example
the implementation of cross-correlation

 rxy x y k | k = 0 = sum [ x!(i+k) * (conjugate (y!i)) | i - [0..(n-1-k)] ]
   | k   0 = conjugate (rxy y x (-k))
  where n = snd (bounds x) + 1

looks very similar to the textbook definition if one uses arrays.  A
definition using lists would probably use drop, map, and zipWith, and
look nothing like the definitions found in the standard texts.

Many spectral estimation routines are defined in terms of special
matrices (ie, Toeplitz, etc).  Arrays defined recursively by list
comprehensions make it easy to implement algorithms like Levinson-Durbin
recursion, and they look very similar to the mathematical definitions:

 levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart (rho!p))
 where a   = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p], i - [1..k] ]
 rho = array (1,p) [ (k, rhok k) | k - [1..p] ]
 ak 1 1 = -r!1 / r!0
 ak k i | k==i  = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l - [1..(k-1)] ]) 
/ rho!(k-1)
| otherwise = a!(k-1,i) + a!(k,k) * (conjugate (a!(k-1,k-i)))
 rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0
 rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1)

This could be rewritten for lists, but would probably need to be defined
in terms of an aux. recursive function, which destroys the simplicity of
the above definition.

OK, my question then has to do with the efficiency of lists versus
arrays.  Do the latest compilers handle handle arrays efficiently, or
are lists really the way to go?  If there is a performace difference, is
it generally big enough to warrant rewriting algorithms?

A related question is how is specilization handled in arrays with lazy
evaluation.  In the definition of levinson above, the a array is defined
in terms of the ak function.  By doing this, you save some horizontal
space, but it also unburdens the programmer from tracking the recursive
dependencies.  a!(k,k) is needed before a!(i,j) can be calculated, but
lazy evaluation takes care of this.  If the above function is
specialized for r::Array Int (Complex Double) and p::Int, would I be
correct to say that the final value of the function would be unboxed,
but all intermediate values wouldn't?  Now, in some cases, a user may
need all of the model orders from 1..p.  This is handled easilly enough
by just changing the first line. to 

 levinson r p = (a, fmap realPart rho)

Would the a matrix in the tuple be unboxed with specilization?

If anyone is interesting in what I have put together, I will be making
everything public sometime next week.  I have a lot of algorithms
implemented, but I need to clean up the documentation a bit (well, a
lot).

Thanks.

-- 
Matthew Donadio ([EMAIL PROTECTED])
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell