[Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Vasyl Pasternak
Hello Cafe,

I tried to generate memory-efficient list of random numbers, so I've
used uvector library for this task. But it doesn't work,
it still allocates more than 6Gb of memory for the random list of 10
million elements. Here is the code:

> import Text.Printf
> import System.Random
> import Control.Applicative
> import System.Environment
> import Data.Array.Vector

> randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
> randomListU b g size = unfoldU size gen g
>   where gen g = let (x, g') = randomR b g
> in JustS (x :*: g')

> main = do
>   [size] <- map read <$> getArgs
>   let ints = randomListU (-10, 10) (mkStdGen 1) size
>   printf "%d\n"  (sumU ints)

Could someone give a hint, how to implement this function in constant
memory space?

Thank you in advance.

Best regards,
Vasyl Pasternak
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Daniel Fischer
Am Dienstag 09 Februar 2010 13:18:23 schrieb Vasyl Pasternak:
> Hello Cafe,
>
> I tried to generate memory-efficient list of random numbers, so I've
> used uvector library for this task. But it doesn't work,
> it still allocates more than 6Gb of memory for the random list of 10
>
> million elements. Here is the code:

Hmm,

$ ghc -O2 --make ranVec
[1 of 1] Compiling Main ( ranVec.hs, ranVec.o )
Linking ranVec ...
$ ./ranVec 1000 +RTS -sstderr
5130
   4,919,912,080 bytes allocated in the heap
 883,256 bytes copied during GC
  26,896 bytes maximum residency (1 sample(s))
  25,620 bytes maximum slop
   1 MB total memory in use (0 MB lost due to fragmentation)

maximum residency is just eight bytes more than for 100,000 or 1,000,000 
numbers. I think that is constant space.

The ~5 GB total allocation is sequential, ten million new StdGens are 
produced and allocated, then immediately garbage collected. I see no 
problem (except that StdGen is slow, e.g. the Mersenne twister is much 
faster [and allocates less, but still linear in size]).

> > import Text.Printf
> > import System.Random
> > import Control.Applicative
> > import System.Environment
> > import Data.Array.Vector
> >
> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
> > randomListU b g size = unfoldU size gen g
> >   where gen g = let (x, g') = randomR b g
> > in JustS (x :*: g')
> >
> > main = do
> >   [size] <- map read <$> getArgs
> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
> >   printf "%d\n"  (sumU ints)
>
> Could someone give a hint, how to implement this function in constant
> memory space?
>
> Thank you in advance.
>
> Best regards,
> Vasyl Pasternak

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Vasyl Pasternak
Sorry, maybe I should ask more clearer.

I've looked at dons article "Haskell as fast as C"[1], and tried to
implement similar algorithm but for list of random numbers.

Please look at code:

> import Text.Printf
> import Control.Applicative
> import System.Environment
> import Data.Array.Vector

> main = do
>   [size] <- map read <$> getArgs
>   let ints = enumFromToU 0 size :: UArr Int
>   printf "%d\n"  (sumU ints)

This code runs in constant space (on my pc ~25kb allocates on the
heap) regardless of array size. So I tried to achieve similar with
random list, just to replace `enumFromToU` with my own list generator.

So the question - is it possible to implement random list similary to
enumFromToU?


[1]http://donsbot.wordpress.com/2008/06/04/haskell-as-fast-as-c-working-at-a-high-altitude-for-low-level-performance/

Thank you,
Vasyl Pasternak

2010/2/9 Daniel Fischer :
> Am Dienstag 09 Februar 2010 13:18:23 schrieb Vasyl Pasternak:
>> Hello Cafe,
>>
>> I tried to generate memory-efficient list of random numbers, so I've
>> used uvector library for this task. But it doesn't work,
>> it still allocates more than 6Gb of memory for the random list of 10
>>
>> million elements. Here is the code:
>
> Hmm,
>
> $ ghc -O2 --make ranVec
> [1 of 1] Compiling Main             ( ranVec.hs, ranVec.o )
> Linking ranVec ...
> $ ./ranVec 1000 +RTS -sstderr
> 5130
>   4,919,912,080 bytes allocated in the heap
>         883,256 bytes copied during GC
>          26,896 bytes maximum residency (1 sample(s))
>          25,620 bytes maximum slop
>               1 MB total memory in use (0 MB lost due to fragmentation)
>
> maximum residency is just eight bytes more than for 100,000 or 1,000,000
> numbers. I think that is constant space.
>
> The ~5 GB total allocation is sequential, ten million new StdGens are
> produced and allocated, then immediately garbage collected. I see no
> problem (except that StdGen is slow, e.g. the Mersenne twister is much
> faster [and allocates less, but still linear in size]).
>
>> > import Text.Printf
>> > import System.Random
>> > import Control.Applicative
>> > import System.Environment
>> > import Data.Array.Vector
>> >
>> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
>> > randomListU b g size = unfoldU size gen g
>> >   where gen g = let (x, g') = randomR b g
>> >                 in JustS (x :*: g')
>> >
>> > main = do
>> >   [size] <- map read <$> getArgs
>> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
>> >   printf "%d\n"  (sumU ints)
>>
>> Could someone give a hint, how to implement this function in constant
>> memory space?
>>
>> Thank you in advance.
>>
>> Best regards,
>> Vasyl Pasternak
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Daniel Fischer
Am Dienstag 09 Februar 2010 14:51:31 schrieb Vasyl Pasternak:
> Sorry, maybe I should ask more clearer.
>
> I've looked at dons article "Haskell as fast as C"[1], and tried to
> implement similar algorithm but for list of random numbers.
>
> Please look at code:
> > import Text.Printf
> > import Control.Applicative
> > import System.Environment
> > import Data.Array.Vector
> >
> > main = do
> >   [size] <- map read <$> getArgs
> >   let ints = enumFromToU 0 size :: UArr Int
> >   printf "%d\n"  (sumU ints)
>
> This code runs in constant space (on my pc ~25kb allocates on the
> heap) regardless of array size.

The random number thing runs in constant space, too.
The difference is that the enumFromToU produces a tight loop with variables 
which never leave the registers, while in the random number thing at least 
the StdGens are allocated in the heap (the produced Ints may stay in the 
registers, too, I don't know).
But enumFromToU and friends were written for that to happen, there are a 
ton of rewrite rules to help the compiler create tight loops. StdGen has no 
such thing, randomR has no {-# INLINE #-} pragma, so you have a call to 
randomR (well, to randomIValInteger, actually) for each element.
Maybe if you had the source for the PRNG in the same file, it could be 
inlined to give an allocation-free loop.

> So I tried to achieve similar with
> random list, just to replace `enumFromToU` with my own list generator.
>
> So the question - is it possible to implement random list similary to
> enumFromToU?
>
>
> [1]http://donsbot.wordpress.com/2008/06/04/haskell-as-fast-as-c-working-
>at-a-high-altitude-for-low-level-performance/
>
> Thank you,
> Vasyl Pasternak
>
> 2010/2/9 Daniel Fischer :
> > Am Dienstag 09 Februar 2010 13:18:23 schrieb Vasyl Pasternak:
> >> Hello Cafe,
> >>
> >> I tried to generate memory-efficient list of random numbers, so I've
> >> used uvector library for this task. But it doesn't work,
> >> it still allocates more than 6Gb of memory for the random list of 10
> >>
> >> million elements. Here is the code:
> >
> > Hmm,
> >
> > $ ghc -O2 --make ranVec
> > [1 of 1] Compiling Main             ( ranVec.hs, ranVec.o )
> > Linking ranVec ...
> > $ ./ranVec 1000 +RTS -sstderr
> > 5130
> >   4,919,912,080 bytes allocated in the heap
> >         883,256 bytes copied during GC
> >          26,896 bytes maximum residency (1 sample(s))
> >          25,620 bytes maximum slop
> >               1 MB total memory in use (0 MB lost due to
> > fragmentation)
> >
> > maximum residency is just eight bytes more than for 100,000 or
> > 1,000,000 numbers. I think that is constant space.
> >
> > The ~5 GB total allocation is sequential, ten million new StdGens are
> > produced and allocated, then immediately garbage collected. I see no
> > problem (except that StdGen is slow, e.g. the Mersenne twister is much
> > faster [and allocates less, but still linear in size]).
> >
> >> > import Text.Printf
> >> > import System.Random
> >> > import Control.Applicative
> >> > import System.Environment
> >> > import Data.Array.Vector
> >> >
> >> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
> >> > randomListU b g size = unfoldU size gen g
> >> >   where gen g = let (x, g') = randomR b g
> >> >                 in JustS (x :*: g')
> >> >
> >> > main = do
> >> >   [size] <- map read <$> getArgs
> >> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
> >> >   printf "%d\n"  (sumU ints)
> >>
> >> Could someone give a hint, how to implement this function in constant
> >> memory space?
> >>
> >> Thank you in advance.
> >>
> >> Best regards,
> >> Vasyl Pasternak

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Daniel Fischer
Am Tuesday 09 February 2010 15:43:13 schrieben Sie:
> Update:
>
> I've implemented `enumFromToU` through `unfoldU`:
> > enumFromToU' from to = unfoldU (to - from) f from
> >where f i = let i' = i + 1 in JustS (i' :*: i')
>
> This code behaves similarly to `enumFromToU` (i.e. constantly uses ~25
> kb of memory, runs in the same time as above).

Wait,

$ cat venum2.hs 
module Main (main) where
   

import Text.Printf
import Control.Applicative
import System.Environment 
import Data.Array.Vector  

main = do
  [size] <- map read <$> getArgs
  let ints = enumFromToU' 0 size :: UArr Int
  printf "%d\n"  (sumU ints)

enumFromToU' from to = unfoldU (to - from) f from
   where f i = let i' = i + 1 in JustS (i' :*: i')

$ ghc -O2 --make venum2
[1 of 1] Compiling Main ( venum2.hs, venum2.o )
Linking venum2 ... 
$ ./venum2 +RTS -sstderr -RTS 100
./venum2 100 +RTS -sstderr  
 
1784293664  
 
  48,256,384 bytes allocated in the heap
 
   6,256 bytes copied during GC 
 
  26,804 bytes maximum residency (1 sample(s))  
 
  25,524 bytes maximum slop 
 
   1 MB total memory in use (0 MB lost due to fragmentation)

$ ./venum2 +RTS -sstderr -RTS 1000
./venum2 1000 +RTS -sstderr
-2004260032
 481,937,552 bytes allocated in the heap
  19,516 bytes copied during GC
  26,804 bytes maximum residency (1 sample(s))
  25,512 bytes maximum slop
   1 MB total memory in use (0 MB lost due to fragmentation)

So we have constant memory, but linear allocation, just like with the 
random numbers.

With enumFromToU, also the allocation is constant, so unfoldU allocates, 
enumFromToU not.

>
> So the question is why random number list generator eats O(n) memory ?

It doesn't, not here, at least. Using System.Random, the allocation figures 
are about ten times as high, but the residency is about the same.
Putting the PRNG code in the same file and tweaking things a bit 
(eliminating all intermediate Integers, e.g.), I get

$ cat ran2Vec.hs
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Text.Printf
import Control.Applicative
import System.Environment
import Data.Array.Vector

randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
randomListU b@(l,h) g size = unfoldU size gen g
  where
!k = h-l+1
!b' = 2147483561 `rem` k
gen !g = let (!x, !g') = stdNext g
 in JustS ((l+ (x+b') `rem` k) :*: g')

main = do
  [size] <- map read <$> getArgs
  let ints = randomListU (-10, 10) (mkStdGen 1) size
  printf "%d\n"  (sumU ints)

data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int

mkStdGen :: Int -> StdGen
mkStdGen s
 | s < 0 = mkStdGen (-s)
 | otherwise = StdGen (s1+1) (s2+1)
  where
(q, s1) = s `divMod` 2147483562
s2  = q `mod` 2147483398

{-# INLINE stdNext #-}
stdNext :: StdGen -> (Int, StdGen)
-- Returns values in the range stdRange
stdNext (StdGen s1 s2) = z' `seq` g' `seq` (z', g')
where
!g'   = StdGen s1'' s2''
!z'   = if z < 1 then z + 2147483562 else z
!z= s1'' - s2''

!k= s1 `quot` 53668
!s1'  = 40014 * (s1 - k * 53668) - k * 12211
!s1'' = if s1' < 0 then s1' + 2147483563 else s1'

!k'   = s2 `quot` 52774
!s2'  = 40692 * (s2 - k' * 52774) - k' * 3791
!s2'' = if s2' < 0 then s2' + 2147483399 else s2'
$ ghc -O2 -funbox-strict-fields -funfolding-use-threshold=16  --make 
ran2Vec -o ran2AVec5
Linking ran2AVec5 ...
$ ./ran2AVec5 +RTS -sstderr -RTS 1000
./ran2AVec5 1000 +RTS -sstderr
5130
 521,828,888 bytes allocated in the heap
   8,664 bytes copied during GC
  26,788 bytes maximum residency (1 sample(s))
  25,636 bytes maximum slop
   1 MB total memory in use (0 MB lost due to fragmentation)

nearly the same allocation figures as for the *much* simpler enumFromToU', 
and it's about ten times as fast as System.Random.

>
> Regards,
> Vasyl Pasternak
>
> 2010/2/9 Vasyl Pasternak :
> > Sorry, maybe I should ask more clearer.
> >
> > I've looked at dons article "Haskell as fast as C"[1], and tried to
> > implement similar algorithm but for list of random numbers.
> >
> > Please look at code:
> >> import Text.Printf
> >> import Control.Applicative
> >> import System.Environment
> >> import Data.Array.Vector
> >>
> >> main = do
> >>   [size] <- map read <$> getArgs
> >>   let ints = enumFromToU 0 size :: UArr Int
> >>   printf "%d\n"  (sumU ints)
> >
> > This code runs in constant space (on my pc ~2

Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Bryan O'Sullivan
On Tue, Feb 9, 2010 at 4:18 AM, Vasyl Pasternak
wrote:

>
> I tried to generate memory-efficient list of random numbers, so I've
> used uvector library for this task.


Use the mwc-random package. It provides a function that does exactly this,
and produces better quality random numbers with much higher performance
(1000x faster) than System.Random or even mersenne-random.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Daniel Fischer
Am Tuesday 09 February 2010 19:19:18 schrieben Sie:
> Daniel,
>
> I've just run venum2 program locally and here is my results:
>
> $ ./venum2 1000 +RTS -s
> ./venum2 1000 +RTS -s
> 500500
>   22,736 bytes allocated in the heap
>  688 bytes copied during GC
>   17,184 bytes maximum residency (1 sample(s))
>   19,680 bytes maximum slop
>1 MB total memory in use (0 MB lost due to fragmentation)
>
> $ ./venum2 10 +RTS -s
> ./venum2 10 +RTS -s
> 55
>   24,152 bytes allocated in the heap
>  688 bytes copied during GC
>   17,184 bytes maximum residency (1 sample(s))
>   19,680 bytes maximum slop
>1 MB total memory in use (0 MB lost due to fragmentation)
>
> So my PC shows that there is constant memory allocation. Maybe I'm
> doing something wrong ?

Unlikely.
However, I notice that you seem to have a 64-bit system.
I don't know the details, but usually those have more registers than a 32-
bit system, so you can get more complicated loops to run completely in the 
registers.
I think that's what happens here, enumFromToU' is just complicated enough 
to not run in the registers on my 32-bit system, but still runs in the 
registers on your 64-bit system.

The PRNG code is too complicated (too many temporary variables) to run in 
the registers on either system (BTW, have you tried it with the specialised 
PRNG code in the source file? I'm not sure whether that might be just small 
enough to run in the registers on a 64-bit system.).

>
> 2010/2/9 Daniel Fischer :
> > Am Tuesday 09 February 2010 15:43:13 schrieben Sie:
> >> Update:
> >>
> >> I've implemented `enumFromToU` through `unfoldU`:
> >> > enumFromToU' from to = unfoldU (to - from) f from
> >> >    where f i = let i' = i + 1 in JustS (i' :*: i')
> >>
> >> This code behaves similarly to `enumFromToU` (i.e. constantly uses
> >> ~25 kb of memory, runs in the same time as above).
> >
> > Wait,
> >
> > $ cat venum2.hs
> > module Main (main) where
> >
> > import Text.Printf
> > import Control.Applicative
> > import System.Environment
> > import Data.Array.Vector
> >
> > main = do
> >   [size] <- map read <$> getArgs
> >   let ints = enumFromToU' 0 size :: UArr Int
> >   printf "%d\n"  (sumU ints)
> >
> > enumFromToU' from to = unfoldU (to - from) f from
> >    where f i = let i' = i + 1 in JustS (i' :*: i')
> >
> > $ ghc -O2 --make venum2
> > [1 of 1] Compiling Main             ( venum2.hs, venum2.o )
> > Linking venum2 ...
> > $ ./venum2 +RTS -sstderr -RTS 100
> > ./venum2 100 +RTS -sstderr
> > 1784293664
> >      48,256,384 bytes allocated in the heap
> >           6,256 bytes copied during GC
> >          26,804 bytes maximum residency (1 sample(s))
> >          25,524 bytes maximum slop
> >               1 MB total memory in use (0 MB lost due to
> > fragmentation)
> >
> > $ ./venum2 +RTS -sstderr -RTS 1000
> > ./venum2 1000 +RTS -sstderr
> > -2004260032
> >     481,937,552 bytes allocated in the heap
> >          19,516 bytes copied during GC
> >          26,804 bytes maximum residency (1 sample(s))
> >          25,512 bytes maximum slop
> >               1 MB total memory in use (0 MB lost due to
> > fragmentation)
> >
> > So we have constant memory, but linear allocation, just like with the
> > random numbers.
> >
> > With enumFromToU, also the allocation is constant, so unfoldU
> > allocates, enumFromToU not.
> >
> >> So the question is why random number list generator eats O(n) memory
> >> ?
> >
> > It doesn't, not here, at least. Using System.Random, the allocation
> > figures are about ten times as high, but the residency is about the
> > same. Putting the PRNG code in the same file and tweaking things a bit
> > (eliminating all intermediate Integers, e.g.), I get
> >
> > $ cat ran2Vec.hs
> > {-# LANGUAGE BangPatterns #-}
> > module Main (main) where
> >
> > import Text.Printf
> > import Control.Applicative
> > import System.Environment
> > import Data.Array.Vector
> >
> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
> > randomListU b@(l,h) g size = unfoldU size gen g
> >   where
> >    !k = h-l+1
> >    !b' = 2147483561 `rem` k
> >    gen !g = let (!x, !g') = stdNext g
> >             in JustS ((l+ (x+b') `rem` k) :*: g')
> >
> > main = do
> >   [size] <- map read <$> getArgs
> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
> >   printf "%d\n"  (sumU ints)
> >
> > data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int
> >
> > mkStdGen :: Int -> StdGen
> > mkStdGen s
> >  | s < 0     = mkStdGen (-s)
> >  | otherwise = StdGen (s1+1) (s2+1)
> >      where
> >    (q, s1) = s `divMod` 2147483562
> >    s2      = q `mod` 2147483398
> >
> > {-# INLINE stdNext #-}
> > stdNext :: StdGen -> (Int, StdGen)
> > -- Returns values in the range stdRange
> > stdNext (StdGen s1 s2) = z' `seq` g' `seq` (z', g')
> >    where
> >        !g'   = StdGen s1'' s2''
> >        !z'   = if z 

Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Daniel Fischer
Am Dienstag 09 Februar 2010 19:27:58 schrieb Bryan O'Sullivan:
> On Tue, Feb 9, 2010 at 4:18 AM, Vasyl Pasternak
>
> wrote:
> > I tried to generate memory-efficient list of random numbers, so I've
> > used uvector library for this task.
>
> Use the mwc-random package. It provides a function that does exactly
> this, and produces better quality random numbers with much higher
> performance (1000x faster) than System.Random or even mersenne-random.

Not here.
I may be doing it wrong, but


{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Text.Printf
import System.Random.MWC
import Control.Applicative
import System.Environment
import Data.Array.Vector
import Control.Monad.ST

randomListU :: (Int, Int) -> Int -> Int -- (UArr Int)
randomListU b@(l,h) size = runST $ do
let !k = h-l+1
f !m = m `mod` k + l
sg <- create
sumU . mapU f <$> uniformArray sg size

main = do
  [size] <- map read <$> getArgs
  let int = randomListU (-10, 10) size
  printf "%d\n" int


gives
$ ghc -O2 -funfolding-use-threshold=32 -fforce-recomp --make mwcRanVec.hs -
o mwcRanVec3
[1 of 1] Compiling Main ( mwcRanVec.hs, mwcRanVec.o )
Linking mwcRanVec3 ...
$ ./mwcRanVec3 +RTS -s -RTS 1000
./mwcRanVec3 1000 +RTS -s
22784
  40,966,820 bytes allocated in the heap
   3,696 bytes copied during GC
  27,128 bytes maximum residency (1 sample(s))
  26,940 bytes maximum slop
  40 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 1 collections, 0 parallel,  0.00s,  0.00s elapsed
  Generation 1: 1 collections, 0 parallel,  0.00s,  0.00s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time1.09s  (  1.15s elapsed)
  GCtime0.00s  (  0.00s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time1.10s  (  1.15s elapsed)

  %GC time   0.0%  (0.0% elapsed)

  Alloc rate37,376,166 bytes per MUT second

  Productivity  99.6% of total user, 95.2% of total elapsed

, System.Random.Mersenne

$ ./mtRanVec +RTS -s -RTS 1000
./mtRanVec 1000 +RTS -s
-24541
 280,609,188 bytes allocated in the heap
  17,404 bytes copied during GC
  26,776 bytes maximum residency (1 sample(s))
  25,724 bytes maximum slop
   1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   535 collections, 0 parallel,  0.00s,  0.00s elapsed
  Generation 1: 1 collections, 0 parallel,  0.00s,  0.00s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time1.10s  (  1.10s elapsed)
  GCtime0.00s  (  0.00s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time1.10s  (  1.10s elapsed)

  %GC time   0.4%  (0.4% elapsed)

  Alloc rate255,083,261 bytes per MUT second

  Productivity  99.3% of total user, 99.5% of total elapsed

more or less the same, the System.Random code gives

$ ./uRanVec +RTS -s -RTS 1000./uRanVec 1000 +RTS -s
5130
   4,515,826,700 bytes allocated in the heap
 803,132 bytes copied during GC
  26,852 bytes maximum residency (1 sample(s))
  25,716 bytes maximum slop
   1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  8680 collections, 0 parallel,  0.10s,  0.10s elapsed
  Generation 1: 1 collections, 0 parallel,  0.00s,  0.00s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time9.12s  (  9.17s elapsed)
  GCtime0.10s  (  0.10s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time9.22s  (  9.27s elapsed)

  %GC time   1.1%  (1.1% elapsed)

  Alloc rate495,342,570 bytes per MUT second

  Productivity  98.9% of total user, 98.3% of total elapsed

(so a factor of a little above 8), and the specialised System.Random code 
in the source file,


{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Text.Printf
import Control.Applicative
import System.Environment
import Data.Array.Vector

randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
randomListU b@(l,h) g size = unfoldU size gen g
  where
!k = h-l+1
!b' = 2147483561 `rem` k
gen !g = let (!x, !g') = stdNext g
 in JustS ((l+ (x+b') `rem` k) :*: g')

main = do
  [size] <- map read <$> getArgs
  let ints = randomListU (-10, 10) (mkStdGen 1) size
  printf "%d\n"  (sumU ints)

data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int

mkStdGen :: Int -> StdGen
mkStdGen s
 | s < 0 = mkStdGen (-s)
 | otherwise = StdGen (s1+1) (s2+1)
  where
(q, s1) = s `divMod` 2147483562
s2  = q `mod` 2147483398

{-# INLINE stdNext #-}
stdNext :: StdGen -> (Int, StdGen)
-- Returns values in the range stdRange
stdNext (StdGen s1 s2) = z' `seq` g' `seq` (z', g')
where
!g'   = StdGen s1'' s2''
!z'   = if z < 1 then z + 2147483562 else z
!z=

Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Vasyl Pasternak
Daniel,

Yes, I have 64 bit system.

Maybe you're right. The PRNG code with the same vector size allocates
two times more memory at my PC. (~ 1 Gb)

Thank you,
Vasyl

2010/2/9 Daniel Fischer :
> Am Tuesday 09 February 2010 19:19:18 schrieben Sie:
>> Daniel,
>>
>> I've just run venum2 program locally and here is my results:
>>
>> $ ./venum2 1000 +RTS -s
>> ./venum2 1000 +RTS -s
>> 500500
>>           22,736 bytes allocated in the heap
>>              688 bytes copied during GC
>>           17,184 bytes maximum residency (1 sample(s))
>>           19,680 bytes maximum slop
>>                1 MB total memory in use (0 MB lost due to fragmentation)
>>
>> $ ./venum2 10 +RTS -s
>> ./venum2 10 +RTS -s
>> 55
>>           24,152 bytes allocated in the heap
>>              688 bytes copied during GC
>>           17,184 bytes maximum residency (1 sample(s))
>>           19,680 bytes maximum slop
>>                1 MB total memory in use (0 MB lost due to fragmentation)
>>
>> So my PC shows that there is constant memory allocation. Maybe I'm
>> doing something wrong ?
>
> Unlikely.
> However, I notice that you seem to have a 64-bit system.
> I don't know the details, but usually those have more registers than a 32-
> bit system, so you can get more complicated loops to run completely in the
> registers.
> I think that's what happens here, enumFromToU' is just complicated enough
> to not run in the registers on my 32-bit system, but still runs in the
> registers on your 64-bit system.
>
> The PRNG code is too complicated (too many temporary variables) to run in
> the registers on either system (BTW, have you tried it with the specialised
> PRNG code in the source file? I'm not sure whether that might be just small
> enough to run in the registers on a 64-bit system.).
>
>>
>> 2010/2/9 Daniel Fischer :
>> > Am Tuesday 09 February 2010 15:43:13 schrieben Sie:
>> >> Update:
>> >>
>> >> I've implemented `enumFromToU` through `unfoldU`:
>> >> > enumFromToU' from to = unfoldU (to - from) f from
>> >> >    where f i = let i' = i + 1 in JustS (i' :*: i')
>> >>
>> >> This code behaves similarly to `enumFromToU` (i.e. constantly uses
>> >> ~25 kb of memory, runs in the same time as above).
>> >
>> > Wait,
>> >
>> > $ cat venum2.hs
>> > module Main (main) where
>> >
>> > import Text.Printf
>> > import Control.Applicative
>> > import System.Environment
>> > import Data.Array.Vector
>> >
>> > main = do
>> >   [size] <- map read <$> getArgs
>> >   let ints = enumFromToU' 0 size :: UArr Int
>> >   printf "%d\n"  (sumU ints)
>> >
>> > enumFromToU' from to = unfoldU (to - from) f from
>> >    where f i = let i' = i + 1 in JustS (i' :*: i')
>> >
>> > $ ghc -O2 --make venum2
>> > [1 of 1] Compiling Main             ( venum2.hs, venum2.o )
>> > Linking venum2 ...
>> > $ ./venum2 +RTS -sstderr -RTS 100
>> > ./venum2 100 +RTS -sstderr
>> > 1784293664
>> >      48,256,384 bytes allocated in the heap
>> >           6,256 bytes copied during GC
>> >          26,804 bytes maximum residency (1 sample(s))
>> >          25,524 bytes maximum slop
>> >               1 MB total memory in use (0 MB lost due to
>> > fragmentation)
>> >
>> > $ ./venum2 +RTS -sstderr -RTS 1000
>> > ./venum2 1000 +RTS -sstderr
>> > -2004260032
>> >     481,937,552 bytes allocated in the heap
>> >          19,516 bytes copied during GC
>> >          26,804 bytes maximum residency (1 sample(s))
>> >          25,512 bytes maximum slop
>> >               1 MB total memory in use (0 MB lost due to
>> > fragmentation)
>> >
>> > So we have constant memory, but linear allocation, just like with the
>> > random numbers.
>> >
>> > With enumFromToU, also the allocation is constant, so unfoldU
>> > allocates, enumFromToU not.
>> >
>> >> So the question is why random number list generator eats O(n) memory
>> >> ?
>> >
>> > It doesn't, not here, at least. Using System.Random, the allocation
>> > figures are about ten times as high, but the residency is about the
>> > same. Putting the PRNG code in the same file and tweaking things a bit
>> > (eliminating all intermediate Integers, e.g.), I get
>> >
>> > $ cat ran2Vec.hs
>> > {-# LANGUAGE BangPatterns #-}
>> > module Main (main) where
>> >
>> > import Text.Printf
>> > import Control.Applicative
>> > import System.Environment
>> > import Data.Array.Vector
>> >
>> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
>> > randomListU b@(l,h) g size = unfoldU size gen g
>> >   where
>> >    !k = h-l+1
>> >    !b' = 2147483561 `rem` k
>> >    gen !g = let (!x, !g') = stdNext g
>> >             in JustS ((l+ (x+b') `rem` k) :*: g')
>> >
>> > main = do
>> >   [size] <- map read <$> getArgs
>> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
>> >   printf "%d\n"  (sumU ints)
>> >
>> > data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int
>> >
>> > mkStdGen :: Int -> StdGen
>> > mkStdGen s
>> >  | s < 0     = mkStdGen (-s)
>> >  | otherwise = StdGen (s1+1) (s2+1

Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Vasyl Pasternak
Bryan,

mwc-random is really fast. But it eats to much memory. My previous
attempts were to reduce total number of allocations. This package made
this possible, but it increases total memory usage.

Here is the code:

import Text.Printf
import System.Random.MWC
import Control.Applicative
import System.Environment
import Data.Array.Vector
import Control.Monad.ST

randomListU size = runST $ flip uniformArray size =<< create

main = do
  [size] <- map read <$> getArgs
  printf "%d\n" $ (sumU (randomListU size) :: Int)

And sample tests:

./mwcvec 1000 +RTS -s
-9198901858466039191
 165,312 bytes allocated in the heap
 688 bytes copied during GC
  17,184 bytes maximum residency (1 sample(s))
  19,680 bytes maximum slop
  78 MB total memory in use (1 MB lost due to fragmentation)


./mwcvec 1 +RTS -s
2242701120799374676
 165,704 bytes allocated in the heap
 688 bytes copied during GC
  17,184 bytes maximum residency (1 sample(s))
  19,680 bytes maximum slop
 764 MB total memory in use (12 MB lost due to fragmentation)

./mwcvec 10 +RTS -s
mwcvec: out of memory (requested 8000634880 bytes)

I don't know exactly, but is this a normal behavior ?

Thank you,
Vasyl

2010/2/9 Bryan O'Sullivan :
> On Tue, Feb 9, 2010 at 4:18 AM, Vasyl Pasternak 
> wrote:
>>
>> I tried to generate memory-efficient list of random numbers, so I've
>> used uvector library for this task.
>
> Use the mwc-random package. It provides a function that does exactly this,
> and produces better quality random numbers with much higher performance
> (1000x faster) than System.Random or even mersenne-random.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Bryan O'Sullivan
On Tue, Feb 9, 2010 at 3:48 PM, Vasyl Pasternak
wrote:

>
> mwc-random is really fast. But it eats to much memory.


It creates and returns a vector, so if you ask it to give you a billion
items, it's going to require north of 8 gigabytes of memory. This should not
come as a surprise, I'd hope :-)  Assuming that's not what you actually
want, you should look at other entry points in the API, which you can use to
generate a single value at a time in constant space.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-09 Thread Felipe Lessa
On Tue, Feb 09, 2010 at 04:27:57PM -0800, Bryan O'Sullivan wrote:
> It creates and returns a vector, so if you ask it to give you a billion
> items, it's going to require north of 8 gigabytes of memory. This should not
> come as a surprise, I'd hope :-)  Assuming that's not what you actually
> want, you should look at other entry points in the API, which you can use to
> generate a single value at a time in constant space.

He thought the vector would be fused away by the library, which
is one of the selling points of uvector.  Sadly the
implementation of uniformArray wasn't done with this purpose in
mind.

--
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Generate random UArray in constant memory space.

2010-02-10 Thread Vasyl Pasternak
Hi all,

To summarize everything in this thread I've tested mwc-random,
System.Random and mersenne random numbers (mersenne-random-pure64).
Here the score table:

[THIRD PLACE] Generic Random Number Generator. Is the slowest and
allocates too much memory in the heap. The total memory usage is
constant and very low.

[SECOND PLACE] MWC-RANDOM. The fastest random number generator ever.
But it uses O(n) memory for generate random numbers. Thought it isn't
possible to calculate a really large set of random numbers (my PC
stuck with calculating 500 millions random numbers with memory usage
above 3,5Gb). The memory usage for me is more important than time,
because I can easily wait additional 5-10-15 mins, but I cant so
easily put additional memory to my PC. Thus this is only second place.

[FIRST PLACE] Mercenne Random number generator. Approx 10 times faster
than generic and two times slower than mwc. But it works in constant
memory space, so theoretically it could generate infinite list of
numbers. It also uses 6 time less total allocations, than generic RNG.

NOTE: These tests didn't test the quality of the random sequences,
only speed/memory.

Thanks to everyone, who helped me with this code, it seems, that now I
understand optimizations much better, than a day ago.

Best regards,
Vasyl

2010/2/10 Felipe Lessa :
> On Tue, Feb 09, 2010 at 04:27:57PM -0800, Bryan O'Sullivan wrote:
>> It creates and returns a vector, so if you ask it to give you a billion
>> items, it's going to require north of 8 gigabytes of memory. This should not
>> come as a surprise, I'd hope :-)  Assuming that's not what you actually
>> want, you should look at other entry points in the API, which you can use to
>> generate a single value at a time in constant space.
>
> He thought the vector would be fused away by the library, which
> is one of the selling points of uvector.  Sadly the
> implementation of uniformArray wasn't done with this purpose in
> mind.
>
> --
> Felipe.
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe