On 11/05/2010, at 05:29, Edward Z. Yang wrote:

> Some of the important primitives offered by Data Parallel Haskell are
> reduction primitives such as sumP and prodP, which take a data parallel
> array and reduce it to a single value.  I was wondering what the current
> capabilities for end-users interested in implementing their own
> reduction primitives were.  In particular, if I want to run data parallel
> computations on strings, I will generally want a more exotic set of
> combining operators.
> 
> thoughtpolice informed me that GHC 6.10 seemed to have sumP/prodP hard
> coded into the vectorisation and optimisation stages, so this didn't
> really seem possible in userspace.  I'm interested to know if this situation
> has changed.  No hard feelings if it hasn't; I'm really just playing around
> with DPH and seeing what it can do. :-)

Short answer: we will eventually allow arbitrary operators in folds and scans 
but we need some time to figure out how to do this efficiently and it's not 
really on the top of our todo list at the moment.

That said, there are no theoretical problems with allowing user-defined 
operators and we could, with some work, provide support for them now. The only 
real difficulty is efficiency. Consider than in (foldP f xs) the function f 
might itself do something in parallel. In that case, we have to eliminate the 
nesting. This is quite straightforward to do by splitting xs in two halves, ys 
and zs, and then computing (zipWithP f ys zs). Then, we split the result, zip 
again and so on until we are left with a one-element array. This works because 
zipWithP "knows" how to eliminate nested parallelism. In fact, we have just 
directly encoded the parallel tree reduction algorithm.

We can do *much* better if we know that f is purely sequential, though (e.g., 
if f is Int addition). Now, we just split xs into chunks, one per thread, 
compute the local sums (a tight, sequential loop on each thread) and then 
combine them. This is a huge performance win.

For now, we have just provided specialised folds which we know are fast and 
which are easy to implement. When the rest of the system works reliably, we'll 
think about how to implement the general fold combinator with good performance 
for purely sequential combining operators.

Roman


_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Reply via email to