Re: [Haskell-cafe] bytestring vs. uvector

Claus Reinke Sun, 08 Mar 2009 17:48:12 -0700

uvector is, if my memory serves me correctly, a fork of the vector library.
It uses modern stream fusion, but is under active development and is a
little scary. I'm a little unclear on the exact difference between uvector
and vector. Both use arrays that are not pinned, so they can't be readily
used with foreign code. If you want to use either library, understand that
you're embarking on a bracing adventure.


vector and uvector are roughly based on the same technology; uvector
is - as far as I remember - a fork of some of the old DPH code which
uses stream fusion which Don cleaned up and worked on (and it's proven
pretty useful, and people are still hacking on it.)

vector however, has the notion of 'recycling arrays' when it does
array operations. The technique is in fact quite similar to stream
fusion. Roman L. built this from scratch I think, so it's quite a bit
more unused and less stable than even uvector is maybe, but I suppose
you could say it's kind of a superset of uvector. Hopefully though
it should mature a little, and the plans are to have the technology
from both of these folded into the Data Parallel Haskell project so we
get fast array operations+automatic parallelisation.

For info, see Roman's paper, 'Recycle your arrays!'

http://www.cse.unsw.edu.au/~rl/publications/recycling.html


Given the close relationship between uvector and vector, it would
be very helpful if both package descriptions on hackage could
point to a common haskell wiki page, starting out with the text
and link above, plus a link to the stream fusion paper (I hadn't
been aware that vector incorporates the recycling work, and
had often wondered about the precise relationship between those
two packages). Apart from saving others from similar confusion,
that would also provide a place to record experience with those two
alternatives.

Btw, have any of the Haskell array optimization researchers
considered fixpoints yet? Both fusion and recycling are based
on rewrite rules of the kind "in . out --> id". Now, given a loop
like this:

   loop a = if c a then loop (out (action (in a))) else a
   loop a

these rules don't apply. Unrolling the loop a fixed number of
times would enable some rule applications, but still some would
remain in the loop body. But with a little rewriting

   loop a = if c a then loop (out (action (in a))) else out (id (in a))
   loop a

   loop a = if c a then loop (out (action (in a))) else out (id (in a))
   (if c a then loop (out (action (in a))) else out (id (in a)))

we can now push the out into the next iteration of the loop or,
if there is no next iteration, into the loop epilogue

   loop a = if c (out a) then loop (action (in (out a))) else id (in (out a))
   out (if c a then loop (action (in a)) else in a)

making the rewrite rule applicable

   loop a = if c (out a) then loop (action a) else id a
   out (if c a then loop (action (in a)) else in a)

leading (modulo bugs, omissions, and oversights;-) to a fused/
recycled loop body, with potentially substantial benefit.
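
For anyone who wants to play with this, here is a minimal sketch of the
before/after loops in plain Haskell. Everything in it is a made-up stand-in,
not code from uvector or vector: Arr plays the materialised array, Str the
fusible representation, and into/out are the conversion pair (named into
because in is a keyword). The rule "into . out --> id" is applied by hand
rather than via a RULES pragma, and the untaken branch of the prologue is
written as "into a" so that both branches have the fused type.

```haskell
-- Hypothetical stand-ins: Arr is the materialised array, Str the fusible form.
newtype Arr = Arr [Int] deriving (Eq, Show)
newtype Str = Str [Int]

-- The conversion pair; 'into' plays the role of "in" in the text above.
into :: Arr -> Str
into (Arr xs) = Str xs

out :: Str -> Arr
out (Str xs) = Arr xs

-- Some per-iteration work on the fusible form (assumed: double every element).
action :: Str -> Str
action (Str xs) = Str (map (* 2) xs)

-- Loop condition on the materialised form (assumed: keep going while the sum
-- is small). Inputs need a positive sum, otherwise neither loop terminates.
c :: Arr -> Bool
c (Arr xs) = sum xs < 1000

-- Original loop: every iteration converts into Str and back out to Arr.
loopOriginal :: Arr -> Arr
loopOriginal a = if c a then loopOriginal (out (action (into a))) else a

-- Transformed loop body: 'out' has been pushed into the next iteration and
-- "into (out s)" rewritten to "s", so only the condition still calls 'out'.
loopFused :: Str -> Str
loopFused s = if c (out s) then loopFused (action s) else s

-- Unrolled first iteration plus epilogue: one 'into' on entry, one 'out' on exit.
runFused :: Arr -> Arr
runFused a = out (if c a then loopFused (action (into a)) else into a)
```

In GHCi, loopOriginal (Arr [1,2,3]) and runFused (Arr [1,2,3]) should agree,
as should any other input with a positive sum. Whether the remaining out in
the condition can also be removed depends on c, which this sketch does not
try to address.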

Claus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
