[Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
I'm looking for the right concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen - SimState - IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] - SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~2 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int - [IO a] - IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. It strikes me as something someone would already have done, but I can't find anything in hackage. Probably I've missed something obvious? Any pointers? If not, what would be the best/easiest existing package to write an extension to? Thanks, Patrick. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
Patrick Caldon p...@pessce.net writes: it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: Which RNG are you using that it needs so much IO? mergeWorld :: [SimState] - SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~2 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int - [IO a] - IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. It strikes me as something someone would already have done, but I can't find anything in hackage. Probably I've missed something obvious? Any pointers? If not, what would be the best/easiest existing package to write an extension to? Thanks, Patrick. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- Ivan Lazar Miljenovic ivan.miljeno...@gmail.com IvanMiljenovic.wordpress.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
Ivan Lazar Miljenovic wrote: Patrick Caldon p...@pessce.net writes: it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: Which RNG are you using that it needs so much IO? Mersenne Twister, System.Random.Mersenne. The ordinary rng kills performance. Patrick. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
Patrick Caldon wrote: I'm looking for the right concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen - SimState - IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] - SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~2 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int - [IO a] - IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. Why particularly do you want to run the 10 jobs on 4 threads? Haskell's run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest. So if what you want is a function: runPar :: [IO a] - IO [a] you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] - CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style): import Control.Concurrent import Control.Concurrent.STM import Control.Monad modifyTVar :: TVar a - (a - a) - STM () modifyTVar tv f = readTVar tv = writeTVar tv . f runPar :: [IO a] - IO [a] runPar ps = do resVar - newTVarIO [] mapM_ (forkIO . (= atomically . modifyTVar resVar . (:))) ps atomically $ do res - readTVar resVar when (length res length ps) retry return res If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach. Thanks, Neil. P.S. take 10 . repeat is the same as replicate 10 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
Neil Brown wrote: Patrick Caldon wrote: I'm looking for the right concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen - SimState - IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] - SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~2 iterations. It's an obvious place for parallelism. I'm looking for a concurrency library with something like: forkSequence :: Int - [IO a] - IO [a] which I could call with something like this: forkSequence 4 (take 10 (repeat (runWorldSim g ss))) this would construct 4 threads, then dispatch the 10 jobs onto the threads, and pack up the results into a list I could run through my merger. Why particularly do you want to run the 10 jobs on 4 threads? Haskell's run-time is quite good at spreading out the lightweight threads onto all your cores, so the easiest thing to do is run the 10 jobs on 10 (light-weight) threads and let the run-time sort out the rest. Thanks so much for that! I'll give it a go. Different threads is just because some of the jobs are memory hogs, and I want to minimize the number running simultaneously. I'll see what happens with a runPar-like approach, and use a queue-based approach if it becomes a problem. So if what you want is a function: runPar :: [IO a] - IO [a] you can easily construct this. Shameless plug: my CHP library effectively has this function already, runParallel :: [CHP a] - CHP [a] (CHP being a slight layer on top of IO). But you can do it just as easily with, say, STM. Here is a version where order doesn't matter (apologies for the point-free style): import Control.Concurrent import Control.Concurrent.STM import Control.Monad modifyTVar :: TVar a - (a - a) - STM () modifyTVar tv f = readTVar tv = writeTVar tv . f runPar :: [IO a] - IO [a] runPar ps = do resVar - newTVarIO [] mapM_ (forkIO . (= atomically . modifyTVar resVar . (:))) ps atomically $ do res - readTVar resVar when (length res length ps) retry return res If order does matter, you can zip the results with an index, and sort by the index afterwards. If efficiency matters, you can perform other tweaks. But the principle is quite straightforward. Or you can refactor your code to take the IO dependency out of your random number generation, and run the sets of pure code in parallel using the parallel library. If all you are using IO for is random numbers, that's probably the nicest approach. Good, fast random numbers are unfortunately necessary - I had a nice implementation using System.Random, but had to rewrite it because performance was poor :( . P.S. take 10 . repeat is the same as replicate 10 Thanks again! Patrick. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
On Fri, 2009-12-04 at 22:51 +1100, Patrick Caldon wrote: I'm looking for the right concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen - SimState - IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. Wait! This is not going to work! You cannot use the MTGen from the mersenne-random in a concurrent IO program because the C code uses a single global mutable RNG state. Your independent simulations would not be independent and you would not get reproducible results. Indeed you could get incorrect results or segfaults because the C code does not expect to be called from multiple threads simultaneously (there is no locking). Personally I would attack this by eliminating the IO. There's no justification for a random number generator being in IO. And look at the problems it causes! There are other MT implementations that do not use C code which assumes it's ok to use one single global mutable RNG state for an entire process. There are pure-Haskell MT impls that use mutable variables in ST but give an overall pure lazy list of random numbers. If you don't need MT specifically then there are other fast RNGs too. Duncan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
On 04/12/09 11:51, Patrick Caldon wrote: I'm looking for the right concurrency library/semantics for what should be a reasonably simple problem. I have a little simulator: runWorldSim :: MTGen - SimState - IO SimState it takes about a second to run on a PC. It's functional except it whacks the rng, which needs IO. I run 5-10 of these jobs, and then use: mergeWorld :: [SimState] - SimState to pick the best features of the runs and build another possible world (state). Then I use this new world to run another 5-10 jobs and so on. I run this through ~2 iterations. It's an obvious place for parallelism. If you can get rid of the need for IO then you can use Control.Parallel to evaluate pure functions instead. If you only use IO for the random numbers then you can either keep a StdGen in your SimState or else use a State StdGen monad. Since your random number use is presumably already in monadic IO you could probably switch to a state monad fairly trivially. Paul. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...
On Fri, Dec 4, 2009 at 7:38 AM, Duncan Coutts duncan.cou...@googlemail.comwrote: Wait! This is not going to work! You cannot use the MTGen from the mersenne-random in a concurrent IO program because the C code uses a single global mutable RNG state. So use the PRNG in the statistics package instead. It's got some nice features that make it a better choice than mersenne-random for essentially all uses: - Faster than mersenne-random - State is encapsulated, so you can have independent PRNGs in different threads or different library modules - You can easily seed independent generators from your system's high-quality PRNG It can also generate normally distributed numbers as well as uniformly distributed numbers (which is all that mersenne-random gives you), and it uses a high-quality fast algorithm for the normal distribution, rather than the usual ziggurat which is somewhat broken. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe