Charles R Harris wrote:

> What sort of api should this be? It occurs to me that there are already 
> 4 sources of random bytes:
> 
> Initialization:
> 
> /dev/random (pseudo random, I think)
> /dev/urandom
> crypto system on windows
> 
> Pseudo random generators:
> 
> mtrand
> 
> I suppose we could add some cryptologically secure source as well. That 
> indicates to me that one set of random number generators would just be 
> streams of random bytes, possibly in 4 byte chunks. If I were doing this 
> for linux these would all look like file systems, FUSE comes to mind. 
> Another set of functions would transform these into the different 
> distributions. So, how much should stay in numpy? What sort of API are 
> folks asking for?

It would be good to also support quasi-random generators like Sobol and Halton 
sequences. They tend to only generate floating point numbers in [0, 1] (or (0, 
1) or [0, 1) or (0, 1]; I think it varies). There are probably other PRNGs that 
only output floating point numbers, but I doubt we want to implement any of 
them. You can look at the 1.6 version of RandomKit for an implementation of 
Sobol sequences and ISAAC, a cryptographic PRNG.

   http://www.jeannot.org/~js/code/index.en.html

I'm thinking that the core struct that we're going to pass around will look 
something like this:

struct random_state {
   void* state;
   unsigned int (*gen_uint32)(struct random_state*);
   double (*gen_double)(struct random_state*);
}

state -- A pointer to the generator-specific struct.

gen_uint32 -- A function pointer that yields a 32-bit unsigned integer. 
Possibly 
NULL if the generator can not implement such a generator.

gen_double -- A function pointer that yields a double-precision number in [0, 
1] 
(possibly omitting one or both of the endpoints depending on the 
implementation).


Possibly this struct needs to be expanded to include function pointers for 
destroying the state struct and for a copying the state to a new object. The 
seeding function(s) probably don't need to travel in the struct. We should 
determine if C++ will work for us, here. Unfortunately, that will force all 
extension modules that want to use this API to be C++ modules.

One other feature of the current implementation of state struct is that a few 
of 
the distributions cache information between calls. Notably, the Box-Müller 
Gaussian distribution algorithm generates two normal deviates at a time; one is 
returned, the other is cached until the next call. The binomial distribution 
computes some intermediate values that are reused if the next call has the same 
parameters. We should probably store such things separately from the core state 
struct this time.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion

Reply via email to