Re: suggestion for Random.randomR

2000-01-12 Thread Matt Harden

George Russell wrote:
 
 Matt Harden wrote:
  I don't think that's really true.  If I understand it correctly, the
  state can be any type; it doesn't have to fit in, say, an Int or other
  small type.  I think the Mersenne Twister could be implemented as an
  instance of Random.RandomGen.  The only thing is I don't really know the
  best way to implement "split" for MT.
 Well yes you can implement MT using the Haskell Random's
 existing framework, but since the "next" function,
 which steps the generator, has type
   next :: g - (Int,g)
 the compiler must either
   (1) update the entire state persistently, which probably
   means some horrid hack with trees, if you don't want
   to have to copy arrays around
 or
   (2) do static analysis to detect that (as in most cases)
   the generator is not being replicated, so you don't
   need persistence.  I think the "Haskell Way", which
   is better, is instead to enforce single-threading via
   some kind of monad.
 Actually I would argue that this is a rare case where persistence is
 not only not wanted but dangerous.  If you have old copies of the state
 hanging around they may get accidentally used so that you have the same
 non-random numbers occurring again.  You do _sometimes_ want to set and
 unset the seed, perhaps once per program, but it is surely better to
 have explicit copying only in this case.
 [snip[
  Yes, and the current interface provides randomIO and randomRIO for those
  who don't want to have to pass the RandomGen instance around.  The cost
  is they have to use the IO monad.
 True, but a conformant implementation has to implement randomIO and randomRIO
 using the same type of state (see section 17.3 of the library standard) and
 so they will suffer from the same disadvantages.  Also the interface I suggest
 is superior because it allows more than one random number generator.  (You
 might for example have an independent part of the program which relies on
 having its own personal deterministic reseedable generator.)

I see your point, if you assume the MT is implemented in terms of
Haskell arrays.  Although MT is "defined" in terms of mutable arrays
(the C code uses them), my particular implementation uses "infinite"
lazy lists instead.  Also, even if MT is implemented with Haskell
arrays, the array itself is only rebuilt once every 624 longwords, and
that's the only time the array has to be copied.  The state would just
be a reference to the array, and an integer pointing to the current
location in the array.  When the array is recreated, a "smart" compiler
would realize that the old array is no longer needed, and would produce
code to update the array in place.  A not-so-smart compiler would simply
collect the old arrays with the GC when they were no longer referenced.

I don't see how an old copy of the state could "accidentally" get used. 
If you pass the same state or "seed" to two parallel routines, then of
course they would get the same random numbers, and that may have been
what you intended.  You can't protect the programmer from his own bad
code.  The compiler would not accidentally use some old version of the
state, unless it was a very brain dead compiler.  If you want two
different random streams from the same seed, you are supposed to use
split().

Btw, I agree that if you have a RNG for which the state is large and
changes with each call to next(), then you can end up with a very
inefficient implementation in the current random framework.  The
Mersenne Twister doesn't happen to fit that profile.

If you force all uses of random numbers to be done in a special monad
(or IO), you *might* make the situation better in those cases, but that
comes at a cost.  I feel that if the monad is IO, that cost is too
high.  If the monad is not IO, and gives you access to get/set the state
and run with multiple random streams, then it's equivalent to what we
have now, so I wouldn't argue too loudly against that.

Yes, your suggested interface allows more than one RNG, but then so does
the current system.  There's nothing superior about your interface from
that perspective.



Re: suggestion for Random.randomR

2000-01-07 Thread George Russell

Matt Harden wrote:
 I don't think that's really true.  If I understand it correctly, the
 state can be any type; it doesn't have to fit in, say, an Int or other
 small type.  I think the Mersenne Twister could be implemented as an
 instance of Random.RandomGen.  The only thing is I don't really know the
 best way to implement "split" for MT.
Well yes you can implement MT using the Haskell Random's
existing framework, but since the "next" function,
which steps the generator, has type
  next :: g - (Int,g)
the compiler must either
  (1) update the entire state persistently, which probably
  means some horrid hack with trees, if you don't want
  to have to copy arrays around
or
  (2) do static analysis to detect that (as in most cases)
  the generator is not being replicated, so you don't
  need persistence.  I think the "Haskell Way", which
  is better, is instead to enforce single-threading via
  some kind of monad.
Actually I would argue that this is a rare case where persistence is
not only not wanted but dangerous.  If you have old copies of the state
hanging around they may get accidentally used so that you have the same
non-random numbers occurring again.  You do _sometimes_ want to set and
unset the seed, perhaps once per program, but it is surely better to 
have explicit copying only in this case.
[snip[
 Yes, and the current interface provides randomIO and randomRIO for those
 who don't want to have to pass the RandomGen instance around.  The cost
 is they have to use the IO monad.
True, but a conformant implementation has to implement randomIO and randomRIO
using the same type of state (see section 17.3 of the library standard) and
so they will suffer from the same disadvantages.  Also the interface I suggest
is superior because it allows more than one random number generator.  (You
might for example have an independent part of the program which relies on
having its own personal deterministic reseedable generator.)



RE: suggestion for Random.randomR

1999-12-23 Thread Simon Peyton-Jones

Sergey

The essence of your message is that the H98 Random library
defn of randomR doesn't really make sense if the type does
not belong to Ord.

I don't want to specify that
|   It is required randomR (lo,hi) g == randomR (hi,lo) g
as you suggest. That would be counter-intuitive for numbers.

How about this instead?

"If the element type 't' is not in class Ord, then the
author of the instance for Random t is responsible for
specifying 
  a) what "uniformly distributed" means
  b) what subset of t is specified by the pair (lo,hi)
"
Simon


| -Original Message-
| From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
| Sent: 22 December 1999 20:17
| To: [EMAIL PROTECTED]
| Subject: suggestion for Random.randomR
| 
| 
| Suggestion for  Haskell-2 library module  Random.
| 
| 
| Haskell-98 declares
| 
| class Random a where  randomR :: RandomGen g = (a, a) - g - (a, g)
|   ... 
| ...
| instance Random Int where ...
| instance Random Integer where ...
| ...
|  * randomR  takes a range (lo,hi) and a random number generator g, and
|returns a random value 
|   uniformly distributed   in the   closed interval [lo,hi], 
|together with a new generator. 
|   It is unspecified what happens if  lo  hi. 
|...
| -
| - here i re-formatted the text and marked two lines with `'.
| 
| 
| I suggest to change this "" part to
| 
| -- (h2) --
|   ...
|   "uniformly distributed" in the "closed interval" [lo,hi], 
|   together with a new generator. 
|   It is required randomR (lo,hi) g == randomR (hi,lo) g
|   to hold for each  g.
|   "Uniformly distributed" means the usual notion for the "numeric"
|   types  Integer,Float,... - as well as for finite ones Bool,Char...
|   Defining the Random instance for other types
|   (say,  T = (Random a, Random b) = (a,b) ),
|   the programmer has to rely on one's own understanding of 
|   "uniform distribution" in domain T.
|   Similar is the notion "closed interval [lo,hi]".
|   For  Integer, Float, ...,  it is standard. 
|   For other domains T, maybe even not supplied with ordering, it 
|   means some subset in T defined by  lo, hi :: T, 
|   this subset specification is on the programmer.
| 
| 
| I do not insist on this formulation literally. 
| This is only the idea.
| Reason:
| the programmer could define, for example,
| 
|   instance (Random a, Random b) = Random (a,b)   --instance (p)
| where
| randomR ((l,l'),(h,h')) g = case  randomR (l,h) g  of
| 
|   (a,g') - case randomR (l',h') g' of  (b,g'') - ((a,b),g'')
| 
| Then, the obstacles arise:
| (1) what does might mean "closed interval [lo,hi]"  in (a,b) ?
| (2) applying  instance (p)  to, say,  (Int,Int)
| may cause evaluation of   randomR (l,h) g  with  l  h  :: Int.
| 
| The above suggestion (h2) provides some explanation for this.
| 
| Personally, my aim was to provide random polynomials  f :: P a  
| over various coefficient domains `a', for the need of testing of 
| some functions. And the format   
| randomR (polynomial1,polynomial2) g 
| 
| derived from what  Haskell-98 Random  provides, looks useful to 
| define random values inside various subsets of  P a. 
| 
| 
| --
| Sergey Mechveliani
| [EMAIL PROTECTED]
| 
| 
| 
| 
| 
| 
| 
| 
| 



Re: suggestion for Random.randomR

1999-12-23 Thread George Russell

Personally I have only one gripe with the Random class in Standard Haskell;
this is that it provides too much functionality.  
In general I think you can only have two of the following 3:
(1) good random numbers 
(2) speed
(3) small state
For example the Mersenne Twister is very very good and fast but has huge
state (several kilobytes), see
   http://www.math.keio.ac.jp/~matumoto/emt.html 
and there is even a Haskell implementation.

GHC's current generator is quite good (it passes the Diehard test, which
makes it better than Java's, but on the other hand I wouldn't trust it for more
than about 2^40 integers), but also pretty slow, even if you don't count
the fact that at each step it has to box up a tuple containing the state.

But the current Haskell implementation forces you to have small state, and
so you must have a slow generator.  You don't really need this.  In most
applications, if you need to get at the state at all, you only need to
do it pretty rarely.  The following interface would suffice.  I agree it
would be slightly yucky if you wanted to create lots of random values in
a structure like a tree, because every function generating a substructure
would have to be an action.  In GHC or Hugs, it would be neater if most of these 
functions were ST RandomGen or something like that.
   
random :: IO a -- for all things in class Random, eg integers, booleans
randomR (as for current interface) to generate ranges
-- only experts really need to look below this line.

random = random' systemRandomGen
random' :: RandomGen - IO a

type RandomGen -- holds state for random generator.
systemRandomGen :: RandomGen -- default generator 

type RandomGenContents -- encodes state of random generator

instance Read RandomGenContents
instance Show RandomGenContents 

setRandomGen :: RandomGenContents - RandomGen - IO()
getRandomGen :: RandomGen - IO RandomGenContents

makeRandomGenContents :: Integer - RandomGenContents -- deterministic
makeRandomGen :: RandomGenContents - IO RandomGen



Re: suggestion for Random.randomR

1999-12-23 Thread Matt Harden

George Russell wrote:
 
 Personally I have only one gripe with the Random class in Standard Haskell;
 this is that it provides too much functionality.
 In general I think you can only have two of the following 3:
 (1) good random numbers
 (2) speed
 (3) small state
 For example the Mersenne Twister is very very good and fast but has huge
 state (several kilobytes), see
http://www.math.keio.ac.jp/~matumoto/emt.html
 and there is even a Haskell implementation.

Nice to see my humble effort recognized.  I don't really think of ~2.5k
as all that huge, but I agree it's a lot bigger than what you normally
see in a simple PRNG.  Of course you get a much better semblance of
randomness because of it.

 But the current Haskell implementation forces you to have small state, and
 so you must have a slow generator.

I don't think that's really true.  If I understand it correctly, the
state can be any type; it doesn't have to fit in, say, an Int or other
small type.  I think the Mersenne Twister could be implemented as an
instance of Random.RandomGen.  The only thing is I don't really know the
best way to implement "split" for MT.

  You don't really need this.  In most
 applications, if you need to get at the state at all, you only need to
 do it pretty rarely.

Yes, and the current interface provides randomIO and randomRIO for those
who don't want to have to pass the RandomGen instance around.  The cost
is they have to use the IO monad.

  The following interface would suffice.  I agree it
 would be slightly yucky if you wanted to create lots of random values in
 a structure like a tree, because every function generating a substructure
 would have to be an action.  In GHC or Hugs, it would be neater if most of these
 functions were ST RandomGen or something like that.

I think we should allow people to pass the RandomGen around if they
choose to, and not force them into the IO monad.  I do agree that the ST
monad would be useful also as an alternative to IO, but only as an
addition to the current interface.  randomST and randomRST could be
provided as parallel functions to randomIO/RIO.  There would also have
to be a setRandomST function to set the RandomGen instance initially.

Cheers,
Matt Harden