Re: [Numpy-discussion] match RNG numbers with R?
Yaroslav Halchenko wrote: > it is in src/main/RNG.c (ack is nice ;) )... from visual inspection looks > "matching" I see... It's a rather vanilla Mersenne Twister, and it just use 32 bits of randomness. An signed int32 is multiplied by 2.3283064365386963e-10 to scale it to [0,1). Then they also have a function called "fixup" that prevents 0 or 1 from being returned. If the generated random value x <= 0.0 they return 0.5*2.328306437080797e-10, and if (1.0-x)<=0.0 they return 1.0-0.5*2.328306437080797e-10. http://svn.r-project.org/R/branches/R-3-1-branch/src/main/RNG.c Also note that the algorithm used to set the seed is important. If you set a seed equal 1 in NumPy, that might not mean the same in R. And then of course there is the algorithm used to sample integers. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
On Mon, Apr 7, 2014 at 3:16 PM, Daπid wrote: > > On Apr 7, 2014 3:59 AM, "Yaroslav Halchenko" wrote: >> so I would assume that the devil is indeed in R post-processing and would >> look >> into it (if/when get a chance). > > The devil here is the pigeon and the holes problem. Mersenne Twister > generates random integers in a certain range. The output is guaranteed to be > deterministic, uniform, and reproducible. > > But when you want to cast those m possible input in n possible outputs, you > need to do magic (or maths) to keep the new distribution truly uniform. > Simply getting random bytes and viewing them as ints will give you low > quality random numbers. > > The algorithm that casts MT output to a random integer is probably what is > different, and perhaps you could find it documented somewhere. This is ours: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c#L259 -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
On Apr 7, 2014 3:59 AM, "Yaroslav Halchenko" wrote: > so I would assume that the devil is indeed in R post-processing and would look > into it (if/when get a chance). The devil here is the pigeon and the holes problem. Mersenne Twister generates random integers in a certain range. The output is guaranteed to be deterministic, uniform, and reproducible. But when you want to cast those m possible input in n possible outputs, you need to do magic (or maths) to keep the new distribution truly uniform. Simply getting random bytes and viewing them as ints will give you low quality random numbers. The algorithm that casts MT output to a random integer is probably what is different, and perhaps you could find it documented somewhere. As a side note, C++11 includes in the standard a MT RNG guaranteed to be the same across implementations, but there is no promise about the distributions -not even across versions of the same implementation. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
On Mon, 07 Apr 2014, Sturla Molden wrote: > > so I would assume that the devil is indeed in R post-processing and would > > look > > into it (if/when get a chance). > I tried to look into the R source code. It's the worst mess I have ever > seen. I couldn't even find their Mersenne twister. it is in src/main/RNG.c (ack is nice ;) )... from visual inspection looks "matching" -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
Yaroslav Halchenko wrote: > so I would assume that the devil is indeed in R post-processing and would look > into it (if/when get a chance). I tried to look into the R source code. It's the worst mess I have ever seen. I couldn't even find their Mersenne twister. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
On Sun, 06 Apr 2014, Sturla Molden wrote: > > R, Python std library, numpy all have Mersenne Twister RNG implementation. > > But > > all of them generate different numbers. This issue was previously > > discussed in > > https://github.com/numpy/numpy/issues/4530 : In Python, and numpy generated > > numbers are based on using 53 bits of two 32 bit random integers generated > > by > > the algorithm (see below).Upon my brief inspection, original 32bit > > numbers > > are nohow available for access neither in NumPy nor in Python stdlib > > implementation. > NumPy uses the Randomkit library. The source is here: > https://github.com/numpy/numpy/tree/master/numpy/random/mtrand > It very easy to modify to produce whatever result you want. You can build > it separately from NumPy. > RandomState.bytes and np.random.bytes gives you the original random bytes > in little endian order. It basically calles rk_fill in Randomkit and then > unpacks the 32 bit random integer into four bytes. Just view it as > dtype=' a = np.random.bytes(4*n).view(dtype='> np.random.seed(1); [int(np.asscalar(np.fromstring(np.random.bytes(4), >> dtype='http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
> a = np.random.bytes(4*n).view(dtype='> 5).astype(np.int32) b = (np.random.bytes(4*n).view(dtype='> 6).astype(np.int32) r = (a * 67108864.0 + b) / 9007199254740992.0 Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] match RNG numbers with R?
Yaroslav Halchenko wrote: > R, Python std library, numpy all have Mersenne Twister RNG implementation. > But > all of them generate different numbers. This issue was previously discussed > in > https://github.com/numpy/numpy/issues/4530 : In Python, and numpy generated > numbers are based on using 53 bits of two 32 bit random integers generated by > the algorithm (see below).Upon my brief inspection, original 32bit numbers > are nohow available for access neither in NumPy nor in Python stdlib > implementation. NumPy uses the Randomkit library. The source is here: https://github.com/numpy/numpy/tree/master/numpy/random/mtrand It very easy to modify to produce whatever result you want. You can build it separately from NumPy. RandomState.bytes and np.random.bytes gives you the original random bytes in little endian order. It basically calles rk_fill in Randomkit and then unpacks the 32 bit random integer into four bytes. Just view it as dtype='http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] match RNG numbers with R?
Hi NumPy gurus, We wanted to test some of our code by comparing to results of R implementation which provides bootstrapped results. R, Python std library, numpy all have Mersenne Twister RNG implementation. But all of them generate different numbers. This issue was previously discussed in https://github.com/numpy/numpy/issues/4530 : In Python, and numpy generated numbers are based on using 53 bits of two 32 bit random integers generated by the algorithm (see below).Upon my brief inspection, original 32bit numbers are nohow available for access neither in NumPy nor in Python stdlib implementation. I wonder if I have missed something and there is an easy way (i.e. without reimplementing core algorithm, or RPy'ing numbers from R) to generate random numbers in Python to match the ones in R? Excerpt from http://nbviewer.ipython.org/url/www.onerussian.com/tmp/random_randomness.ipynb # R %R RNGkind("Mersenne-Twister"); set.seed(1); sample(0:9, 10, replace=T) array([2, 3, 5, 9, 2, 8, 9, 6, 6, 0], dtype=int32) # stock Python random.seed(1); [random.randint(0, 10) for i in range(10)] [1, 9, 8, 2, 5, 4, 7, 8, 1, 0] # numpy rng = nprandom.RandomState(1); [rng.randint(0, 10) for i in range(10)] [5, 8, 9, 5, 0, 0, 1, 7, 6, 9] -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion