[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Herbert resolved RNG-123. ------------------------------ Resolution: Implemented > PCG generators may exhibit massive stream correlation > ----------------------------------------------------- > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Improvement > Components: core > Affects Versions: 1.3 > Reporter: Sebastiano Vigna > Assignee: Alex Herbert > Priority: Major > Fix For: 1.4 > > Time Spent: 0.5h > Remaining Estimate: 0h > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL !!!!! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50 FAIL !!!! > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23 FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1 R= +59.1 p = 4.2e-33 FAIL !!! > DC6-6x2Bytes-1 R= +34.1 p = 9.0e-19 FAIL ! > DC6-5x4Bytes-1 R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17 FAIL > [Low8/32]FPF-14+6/4:(10,14-10) R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !!!!!! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!!!!!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320 FAIL > !!!!!!!!{noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see that their difference is constant. > I think the reader should be made aware of the danger. If you start several > generators of this kind the state is too small to guarantee that there will > be no overlap. Once you get overlap, since there are in practice just two > sequences, you will get a lot of unwanted correlation. > There's a reason why nobody in the last decades ever considered creating > "streams" using the constant part of an LCG, and it's that people realized > very early that it doesn't work (see, e.g., > [https://ieeexplore.ieee.org/document/718715]). > -- This message was sent by Atlassian Jira (v8.3.4#803005)