On 10/19/10 5:47 AM, Ryan Newton wrote:
That sounds good to me.  In any case the parallel map/fold operations by
themselves shouldn't compromise the abstraction.

Perhaps an eventual solution would be to start including parallel maps/folds
right inside the standard libraries.  I haven't begun testing this yet but
it would seem that all the balanced tree implementations are good candidates
for a little `par` treatment.  Has this been tried somewhere already?
    Alas, having par/nonpar versions of library functions compounds the
already present monadic/non-monadic schism...
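
For concreteness, here's a rough sketch of what that sort of `par` treatment might look like. The Tree type below is made up for illustration (it is not Data.Map's actual representation), and the spark only forces to WHNF, so a real version would want deeper forcing:

    import Control.Parallel (par, pseq)

    -- A made-up binary tree type, standing in for the internals of a
    -- balanced-tree container; not Data.Map's real constructors.
    data Tree a = Leaf | Node (Tree a) a (Tree a)

    -- Plain sequential map over the tree.
    treeMap :: (a -> b) -> Tree a -> Tree b
    treeMap _ Leaf         = Leaf
    treeMap f (Node l x r) = Node (treeMap f l) (f x) (treeMap f r)

    -- The same traversal with a little `par`: spark the left subtree
    -- while evaluating the right one.  The spark only evaluates l' to
    -- WHNF (its outermost Node), so in practice you'd force further
    -- for the parallelism to pay off.
    treeMapPar :: (a -> b) -> Tree a -> Tree b
    treeMapPar _ Leaf         = Leaf
    treeMapPar f (Node l x r) = l' `par` (r' `pseq` Node l' (f x) r')
      where
        l' = treeMapPar f l
        r' = treeMapPar f r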

Another problem is that the desirable level of parallelization isn't fixed. For example, let's consider a function f being mapped over a collection C.

With a non-parallel map this has cost O(0*k) + O(C)*O(f), where k is the cost of spawning/reaping a thread, O(C) is the size of the collection, and O(f) is the cost of running f once.

Now, consider mapPar2, which uses two threads. The cost of this is O(1*k) + 2 `par` O(C)/2*O(f), where `par` is a kind of multiplication based on the level of parallelism actually achieved: with perfect parallelism, x `par` y = y; with no parallelism, x `par` y = x*y.
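
A minimal sketch of such a mapPar2, assuming the collection is a plain list split down the middle (just one obvious choice), with `force` from Control.DeepSeq so the sparked half is actually evaluated:

    import Control.Parallel (par, pseq)
    import Control.DeepSeq  (NFData, force)

    -- Two-spark map over a list: map one half in a spark while the
    -- other half is computed on the current thread.  force makes the
    -- spark evaluate its half fully rather than stopping at WHNF.
    mapPar2 :: NFData b => (a -> b) -> [a] -> [b]
    mapPar2 f xs = ys' `par` (zs' `pseq` ys' ++ zs')
      where
        (ys, zs) = splitAt (length xs `div` 2) xs
        ys'      = force (map f ys)
        zs'      = force (map f zs)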

We can generalize this to O((n-1)*k) + n `par` O(C)/n*O(f) for n threads. But the problem is that, depending on how big O(f) is relative to k, a different n gives the optimal tradeoff. If O(f) is small enough, then the overhead of sparking threads is wasted; if O(f) is middling, then we'd want a few threads but not "too many"; and if O(f) is huge, then we want n = O(C), since the overhead disappears in the asymptotics. Thus, what we'd need to offer in the API is a set of evaluators that let the caller alter the level of parallelization.
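For example, the parListChunk strategy from Control.Parallel.Strategies already exposes roughly that knob; something like the following (parMapN is a made-up name, purely for illustration) lets the caller pick n:

    import Control.DeepSeq             (NFData)
    import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

    -- A map with an explicit parallelism knob: the caller picks n, the
    -- rough number of chunks to spark (assumed > 0).  n = 1 puts
    -- everything in one chunk (essentially sequential); n = length xs
    -- sparks every element individually.
    parMapN :: NFData b => Int -> (a -> b) -> [a] -> [b]
    parMapN n f xs = map f xs `using` parListChunk chunkSize rdeepseq
      where
        chunkSize = max 1 (length xs `div` n)

Choosing n well is of course exactly the open question above; the point is only that the evaluator/strategy argument is the natural place to expose it.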

--
Live well,
~wren
