I'll point out that there is a large literature on generating pseudo-random numbers for parallel processes, and it is not as easy as one (at least I) would intuitively think. By contrapositive-like reasoning, one might guess that it will not be easy to pick seeds in a way that produces independent sequences.

(I'm a bit confused about the objective, but) if the objective is to produce independent sequences from different seeds, then the RNGs for parallel processing might be a good place to start. (And, BTW, if you want to reproduce parallel-generated random numbers, you need to keep track of both the starting seed and the number of nodes.)
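For what it is worth, here is a minimal sketch of what the parallel-RNG approach can look like in R, using the "L'Ecuyer-CMRG" generator and nextRNGStream() from the parallel package. The starting seed, the number of streams, and the helper draw_from() are assumptions made up for illustration; they are not from the original question.

```r
library(parallel)  # provides nextRNGStream()

RNGkind("L'Ecuyer-CMRG")   # a generator designed to support multiple streams
set.seed(20171105)         # arbitrary starting seed (an assumption)

n_streams <- 4             # hypothetical number of workers/nodes
streams <- vector("list", n_streams)
streams[[1]] <- .Random.seed
for (i in seq_len(n_streams)[-1]) {
  # each stream is derived from the previous one, not from a hand-picked seed
  streams[[i]] <- nextRNGStream(streams[[i - 1]])
}

# draw_from() is a hypothetical helper: install one stream's state, then draw
draw_from <- function(stream, n = 3) {
  assign(".Random.seed", stream, envir = globalenv())
  runif(n, 17, 26)
}

draws <- sapply(streams, draw_from)  # one column per stream
print(draws)
```

Rerunning this with the same starting seed and the same n_streams should reproduce the same matrix, which is why both need to be recorded.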

Paul Gilbert

On 11/05/2017 10:58 AM, peter dalgaard wrote:

On 5 Nov 2017, at 15:17 , Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

On 04/11/2017 10:20 PM, Daniel Nordlund wrote:
Tirthankar,
"random number generators" do not produce random numbers.  Any given
generator produces a fixed sequence of numbers that appear to meet
various tests of randomness.  By picking a seed you enter that sequence
in a particular place and subsequent numbers in the sequence appear to
be unrelated.  There are no guarantees that if YOU pick a SET of seeds
they won't produce a set of values that are of a similar magnitude.
You can likely solve your problem by following Radford Neal's advice of
not using the first number from each seed.  However, you don't need
to use anything more than the second number.  So, you can modify your
function as follows:
function(x) {
  set.seed(x, kind = "default")
  y = runif(2, 17, 26)
  return(y[2])
}
Hope this is helpful,

That's assuming that the chosen seeds are unrelated to the function output, 
which seems unlikely on the face of it.  You can certainly choose a set of 
seeds that give high values on the second draw just as easily as you can choose 
seeds that give high draws on the first draw.

The interesting thing about this problem is that Tirthankar doesn't believe 
that the seed selection process is aware of the function output.  I would say 
that it must be, and if he is worried about the output, he should be 
investigating how that happens; he shouldn't be worrying about R's RNG.


Hmm, no. The basic issue is that RNGs are constructed so that with 
x_{n+1} = f(x_n), the sequence x_1, x_2, x_3, ... will look random, not so 
that f(s_1), f(s_2), f(s_3), ... will look random for any s_1, s_2, ... . 
This is true even if the seeds s_1, s_2, ... are not chosen so as to mess 
with the RNG. In the present case, it seems that the seeds around 86e6 tend 
to give similar output. On the other hand, it is not _just_ the similarity 
in magnitude that does it, try e.g.

s <- as.integer(runif(1000000, 86.54e6, 86.98e6))
r <- sapply(s, function(s){set.seed(s); runif(1,17,26)})
plot(s,r, pch=".")

and no obvious pattern emerges. My best guess is that the seeds are not only of 
similar magnitude, but also have other bit-pattern similarities.
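To make the distinction between the two constructions concrete, here is a small sketch that draws the same number of values both ways: once as the first draw from each of a run of seeds, and once as a single stream from one seed. The seed range and the helpers first_draw() and lag1() are assumptions chosen only for illustration, and the sketch makes no claim about what the numbers will show on a given platform.

```r
# first_draw() is a hypothetical helper: reseed, then take only the first draw
first_draw <- function(s) {
  set.seed(s)
  runif(1, 17, 26)
}

seeds <- 86540000:86540999   # arbitrary run of nearby seeds (an assumption)

# Construction 1: first draw from each seed, i.e. f(s_1), f(s_2), ...
by_seed <- vapply(seeds, first_draw, numeric(1))

# Construction 2: one seed, then consecutive draws, i.e. x_1, x_2, ...
set.seed(seeds[1])
by_stream <- runif(length(seeds), 17, 26)

# Lag-1 correlation as one crude check of "looking random"
lag1 <- function(x) cor(head(x, -1), tail(x, -1))
print(c(by_seed = lag1(by_seed), by_stream = lag1(by_stream)))
```

Only the second construction is what the generator was designed and tested for; the first may or may not behave well, depending on how the seeds are related.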

(Isn't there a Knuth quote to the effect that "Every random number generator will 
fail in at least one application"?)

One remaining issue is whether it is really true that the same seeds give 
different output on different platforms. That shouldn't happen, I believe.


Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

