Re: [Haskell-cafe] How to improve speed? (MersenneTwister is several times slower than C version)

Lennart Augustsson Wed, 01 Nov 2006 19:05:21 -0800

The whole point of writing the Mersenne Twister was that I wanted toshow how a stateful computation could be encapsulated in the ST monadand none of it showing up outside. This aspect of the code istotally gone now when everything is in the IO monad. Is there somegood reason to have it in the IO monad?


        -- Lennart


On Nov 1, 2006, at 20:51 , Lemmih wrote:

On 11/1/06, isto <[EMAIL PROTECTED]> wrote:

Hi all,

On HaWiki was an announcement of MersenneTwister made by Lennart

Augustsson. On a typical run to find out 10000000th rnd num theoutput

is (code shown below):

$ time ./testMTla
Testing Mersenne Twister.
Result is [3063349438]

real    0m4.925s
user    0m4.856s


I was exercising with the very same algorithm and tried to make it

efficient (by using IOUArray): now a typical run looks like (codeshown

below):

$ time ./testMT
Testing Mersenne Twister.
3063349438

real    0m3.032s
user    0m3.004s


The original C-version (modified so that only the last number is
shown) gives typically

$ time ./mt19937ar
outputs of genrand_int32()
3063349438

real    0m0.624s
user    0m0.616s

Results are similar with 64 bit IOUArray against 64 bit C variant.
C seems to work about 5 to 10 times faster in this case.

I have tried to do different things but now I'm stuck.  unsafeRead
and unsafeWrite improved a bit the lazy (STUArray-version) and
IOUArray-versions but not very much.  I took a look of Core file but

then, I'm not sure where the boxed values are ok. E.g. shouldIOUArray

Int Word64  be replaced with something else?

Any hints and comments on how to improve the efficiency and make
everything better will be appreciated a lot!

br, Isto


Greetings,

Applying a few optimations can make it about 3x faster.

1. Hoist the array out of your loops. (See generateNumbers32,
initialiseGenerator32 and genRNums).
2. Don't create too many new MT32 boxes. Most of the time is spent in
'next32' and changing its type to 'IOUArray Int Word32 -> Int -> IO
(Word32, Int)' makes it much faster.
3. Demand more inlining. If you're using GHC,
-funfolding-use-threshold=16 will substantially improve the
performance.

Using 'seq' is generally a bad idea. It can worsen the performance if
not used carefully and GHCs strictness analyser is usually good
enough.
I used the profiler and -ddump-simpl to analyse this program.

Donald suggested manual unboxing. However, in this case it won't help
much (if at all) since GHC is doing such an excellent job on its own.

--
Cheers,
 Lemmih
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] How to improve speed? (MersenneTwister is several times slower than C version)

Reply via email to