Re: Arrays vs Lists and Specialization
Matthew Donadio [EMAIL PROTECTED] writes: OK, my question then has to do with the efficiency of lists versus arrays. Do the latest compilers handle handle arrays efficiently, or are lists really the way to go? I've currently struggled a bit with arrays. I have a list based program (calculating suffix arrays, since you ask), and since I experience a notably lower performance than array based C equivalents, I thought using arrays would help me out. Currently, I've been able to use arrays efficiently as read-only data structures. I've tried to use STArrays to do updates imperatively, but it's still slow, and uses a lot of memory (that doesn't show up in the heap profiling). I'll try to wrap more of the program in the ST monad, to see if it helps. If there is a performace difference, is it generally big enough to warrant rewriting algorithms? I think it is hard to answer that generally. For some algorithms, the benefit can be significant; it depends on your application, your data set, and your resources. But remember that correct is better than fast, and readable is better than correct. :-) -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: Arrays vs Lists and Specialization
Matthew | Many spectral estimation routines are defined in terms of special | matrices (ie, Toeplitz, etc). Arrays defined recursively by list | comprehensions make it easy to implement algorithms like Levinson-Durbin | recursion, and they look very similar to the mathematical definitions: | | levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart (rho!p)) | where a = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p], i - [1..k] ] |rho = array (1,p) [ (k, rhok k) | k - [1..p] ] |ak 1 1 = -r!1 / r!0 |ak k i | k==i = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l - [1..(k-1)] ]) / | rho!(k-1) | | otherwise = a!(k-1,i) + a!(k,k) * (conjugate (a!(k-1,k-i))) |rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0 |rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1) | | OK, my question then has to do with the efficiency of lists versus | arrays. Do the latest compilers handle handle arrays efficiently, or | are lists really the way to go? If there is a performace difference, is | it generally big enough to warrant rewriting algorithms? Do not rewrite it! This is a textbook application of Haskell arrays, and they should work beautifully. If not, we should fix it. Using lists will probably be much *less* efficient than arrays. People often use arrays with algorithms derived from imperative programming, which assume update-in-place. They use (\\) for each element, which copies the entire array, and get terrible performance. Haskell arrays are designed to be build en-bloc, as you are doing it. Your programs are lovely, and will not do redundant array copying. It is, however, possible that GHC fails to deforest away all the intermediate lists in the array comprehensions, but I think it does a reasonable job. You can use -ddump-simpl to see. | A related question is how is specilization handled in arrays with lazy | evaluation. In the definition of levinson above, the a array is defined | in terms of the ak function. By doing this, you save some horizontal | space, but it also unburdens the programmer from tracking the recursive | dependencies. a!(k,k) is needed before a!(i,j) can be calculated, but | lazy evaluation takes care of this. Yes. The efficiency penalty is that every element is built as a thunk and only later evaluated on demand. | If the above function is | specialized for r::Array Int (Complex Double) and p::Int, would I be | correct to say that the final value of the function would be unboxed, | but all intermediate values wouldn't? No, the function returns an array, which is always boxed. Its *elements* may be unboxed, but only if you, the programmer, specify them to be unboxed. GHC will not unbox array elements automatically, because that makes them strict, and that changes the semantics. To use unboxed arrays, you need the UArray library (documented in GHC's libraries). If the algorithm does not use laziness, you can often unbox array elements just by changing Array to UArray in the types. Easy. But that makes it strict, and that may make your algorithm fail. No magic here. | Now, in some cases, a user may | need all of the model orders from 1..p. This is handled easilly enough | by just changing the first line. to | | levinson r p = (a, fmap realPart rho) | | Would the a matrix in the tuple be unboxed with specilization? Not sure what you mean here, but I don't think so. | If anyone is interesting in what I have put together, I will be making | everything public sometime next week. I have a lot of algorithms | implemented, but I need to clean up the documentation a bit (well, a | lot). Would you like to write it up as a paper for the Journal of Functional Programming? By the way, do you know of David Goblirsch's work http://delivery.acm.org/10.1145/33/326801/p425-goblirsch.pdf?key1=32 6801key2=1086223401coll=GUIDEdl=GUIDECFID=7058054CFTOKEN=55810940 I'm not sure where he is now. Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Arrays vs Lists and Specialization
Matthew Donadio: | Many spectral estimation routines are defined in terms of special | matrices (ie, Toeplitz, etc). Arrays defined recursively by list | comprehensions make it easy to implement algorithms like Levinson-Durbin | recursion, and they look very similar to the mathematical definitions: | | levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart (rho!p)) | where a = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p], i - [1..k] ] | rho = array (1,p) [ (k, rhok k) | k - [1..p] ] | ak 1 1 = -r!1 / r!0 | ak k i | k==i = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l - [1..(k-1)] ]) / | rho!(k-1) | | otherwise = a!(k-1,i) + a!(k,k) * (conjugate (a!(k-1,k-i))) | rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0 | rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1) | | OK, my question then has to do with the efficiency of lists versus | arrays. Do the latest compilers handle handle arrays efficiently, or | are lists really the way to go? If there is a performace difference, is | it generally big enough to warrant rewriting algorithms? Simon: Do not rewrite it! This is a textbook application of Haskell arrays, and they should work beautifully. If not, we should fix it. Using lists will probably be much *less* efficient than arrays. People often use arrays with algorithms derived from imperative programming, which assume update-in-place. They use (\\) for each element, which copies the entire array, and get terrible performance. Haskell arrays are designed to be build en-bloc, as you are doing it. Your programs are lovely, and will not do redundant array copying. It is, however, possible that GHC fails to deforest away all the intermediate lists in the array comprehensions, but I think it does a reasonable job. You can use -ddump-simpl to see. You could check out SAC. SAC is a functional language specialized for arrays, with restrictions in the language to ensure efficient compilation. The SAC compiler can apparently do a very good job on scheduling/memory allocation as to optimize for cache performance: this is an important issue to get performance in array computations that I believe current Haskell implementations do not bother much about. SAC has syntactic forms to define arrays that are somewhat reminiscent of array comprehensions, although they are much more restricted. An interesting route for a *real* array-optimizing Haskell compiler could be to try to find instances of array comprehensions that can be mapped to constructs corresponding to SAC's constructs, and then apply the compilation technology of SAC. An alternative is to make such constructs directly available in Haskell, so the programmer can be explicit whan performance is needed. Homepage http://www.informatik.uni-kiel.de/~sacbase/. Björn Lisper ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: Arrays vs Lists and Specialization
SAC has syntactic forms to define arrays that are somewhat reminiscent of array comprehensions, although they are much more restricted. An interesting route for a *real* array-optimizing Haskell compiler could be to try to find instances of array comprehensions that can be mapped to constructs corresponding to SAC's constructs, and then apply the compilation technology of SAC. An alternative is to make such constructs directly available in Haskell, so the programmer can be explicit whan performance is needed. Also related is Manuel Chakravarty et al's array fusion (http://www.cse.unsw.edu.au/~chak/papers/CK01.html) which hopefully will introduce array comprehensions into Haskell shortly (see http://www.cse.unsw.edu.au/~chak/afp02/). ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Arrays vs Lists and Specialization
Hi all, I'm sorry if this topic has been rehashed a lot, but I poked around in the mailing list archive, and didn't find what I was looking for. I currently have some free time on my hands, and have been implementing some digital signal processing and spectral/frequency estimation algorithms along with the needed matrix routines in Haskell. For those unfamiliar with this field, most algorthms are defined in textbooks in terms of indexing through discrete sequences. For example the implementation of cross-correlation rxy x y k | k = 0 = sum [ x!(i+k) * (conjugate (y!i)) | i - [0..(n-1-k)] ] | k 0 = conjugate (rxy y x (-k)) where n = snd (bounds x) + 1 looks very similar to the textbook definition if one uses arrays. A definition using lists would probably use drop, map, and zipWith, and look nothing like the definitions found in the standard texts. Many spectral estimation routines are defined in terms of special matrices (ie, Toeplitz, etc). Arrays defined recursively by list comprehensions make it easy to implement algorithms like Levinson-Durbin recursion, and they look very similar to the mathematical definitions: levinson r p = (array (1,p) [ (k, a!(p,k)) | k - [1..p] ], realPart (rho!p)) where a = array ((1,1),(p,p)) [ ((k,i), ak k i) | k - [1..p], i - [1..k] ] rho = array (1,p) [ (k, rhok k) | k - [1..p] ] ak 1 1 = -r!1 / r!0 ak k i | k==i = -(r!k + sum [ a!(k-1,l) * r!(k-l) | l - [1..(k-1)] ]) / rho!(k-1) | otherwise = a!(k-1,i) + a!(k,k) * (conjugate (a!(k-1,k-i))) rhok 1 = (1 - (abs (a!(1,1)))^2) * r!0 rhok k = (1 - (abs (a!(k,k)))^2) * rho!(k-1) This could be rewritten for lists, but would probably need to be defined in terms of an aux. recursive function, which destroys the simplicity of the above definition. OK, my question then has to do with the efficiency of lists versus arrays. Do the latest compilers handle handle arrays efficiently, or are lists really the way to go? If there is a performace difference, is it generally big enough to warrant rewriting algorithms? A related question is how is specilization handled in arrays with lazy evaluation. In the definition of levinson above, the a array is defined in terms of the ak function. By doing this, you save some horizontal space, but it also unburdens the programmer from tracking the recursive dependencies. a!(k,k) is needed before a!(i,j) can be calculated, but lazy evaluation takes care of this. If the above function is specialized for r::Array Int (Complex Double) and p::Int, would I be correct to say that the final value of the function would be unboxed, but all intermediate values wouldn't? Now, in some cases, a user may need all of the model orders from 1..p. This is handled easilly enough by just changing the first line. to levinson r p = (a, fmap realPart rho) Would the a matrix in the tuple be unboxed with specilization? If anyone is interesting in what I have put together, I will be making everything public sometime next week. I have a lot of algorithms implemented, but I need to clean up the documentation a bit (well, a lot). Thanks. -- Matthew Donadio ([EMAIL PROTECTED]) ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell