On Thu, Dec 24, 2015 at 04:53:30PM +0200, Janne Blomqvist wrote: > On Thu, Dec 24, 2015 at 1:18 AM, Steve Kargl > <s...@troutmask.apl.washington.edu> wrote: > > On Thu, Dec 24, 2015 at 12:29:02AM +0200, Janne Blomqvist wrote: > >> > >> the GFortran random number generator (RANDOM_NUMBER and RANDOM_SEED > >> intrinsics) has a number of issues that the attached preliminary patch > >> tries to address. > >> > > > > First, I think your effort to retool random numbers is a nice enhancement > > for gfortran. But, ... > > > >> - When RANDOM_SEED is called without arguments, the master_seed and > >> current thread seed is set to random data read from the OS > >> /dev/urandom device. Otherwise like above. > > > > Doesn't this break backwards compatibility with the current > > implementation. In the following program, the call to random_seed > > resets the stream. If an OS has /dev/urandom, then the two streamsr > > won't match? > > Yes, but backwards compatibility is already broken to some extent > (different behavior for multi-threaded programs, different streams due > to a different generator and initial seed), so I thought that this was > a good time to do this as well, although per se the rest of the patch > does not depend on this. I think this is better than the status quo, > because if one wants a static seed it's easy enough to do, but getting > random data from the OS is somewhat trickier, so I think it's better > to do it in the library rather than trusting end users to get it > right.
Ah, okay. I had not thought about the differences in the multithreaded cases. I agree if we break backwards compatibility a little bit, then breaking it a little bit more can't hurt too much. > > Two questions. > > > > 1) Does the patch deal with PR 52879? > > To be honest, I haven't tested. FWIW, the paper at > > http://vigna.di.unimi.it/ftp/papers/xorshift.pdf > > contains results of some experiments on how quickly the generator > escapes from a low-entropy state. > > (I think one issue behind PR 52879 is that part of the seed is only > used for generating real(8/16) and are thus unused for real(4)) Currently, we have 3 KISS generators with different seeds (of course). FX added a pass over the seeds to mix them. See the scramble_seed() and unscramble_seed() functions. > > 2) Does it maintain the correspondence between a REAL(4) and > > REAL(8) stream if the same seeds are used? For example, > > > > program foo > > integer, parameter :: seed(12) = [1, 12, 123, 1234, 12345, 123456, & > > & 654321, 54321, 4321, 321, 21, 1] > > real(4) x > > real(8) y > > call random_seed(put=seed) > > call random_number(x) > > call random_seed(put=seed) > > call random_number(y) > > print *, x, y > > end program foo > > > > % gfc -o z r.f90 && ./z > > 0.181959510 0.18195952290401995 > > No, it doesn't do this. Currently each thread only has a single set of > state variables (vs. 3 for the current). Additionally, for real(4), a > single 64-bit random value is used to create two real(4) variables, > but the performance advantage of this isn't huge compared to the naive > implementation of discarding half of the 64 random bits per call. > OTOH, it's not particularly hard to do this in user code if one wants > to, so is this feature really worth it? IHMO, this a nice feature to have. One can easily test an algorithm in REAL(4), REAL(8), REAL(10), and REAL(16) where the high order bits all match to determine a suitable precision to use. An obvious outcome is choosing a precision to balance performance vs accuracy. So, your mapping is PRNG stream: |------64-----|------64-----| 2 64-bit xorshift values. REAL(4): |--24--|--24--|--24--|--24--| 4 values (24-bit significand). REAL(8): |-----53----xx|-----53----xx| 2 values (53-bit significand). REAL(10): |------64-----|------64-----| 2 values (64-bit significand). REAL(16): |-----------113---------xxxx| 1 value (113-bit significand). The current mapping is PRNG stream: |--32--|--32--|--32--|--32--| 4 32-bit KISS values. REAL(4): |--24--|xxxxxx|xxxxxx|xxxxxx| 4 values (24-bit significand). REAL(8): |-----53----xx|xxxxxxxxxxxxx| 2 values (53-bit significand). REAL(10): |------64-----|xxxxxxxxxxxxx| 2 values (64-bit significand). REAL(16): |-----------113---------xxxx| 1 value (113-bit significand). where x means unused bits. -- Steve