[ 
https://issues.apache.org/jira/browse/RNG-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Herbert updated RNG-173:
-----------------------------
    Component/s: core

> BaseProvider state filling procedure can be improved
> ----------------------------------------------------
>
>                 Key: RNG-173
>                 URL: https://issues.apache.org/jira/browse/RNG-173
>             Project: Commons RNG
>          Issue Type: Improvement
>          Components: core
>            Reporter: Alex Herbert
>            Priority: Trivial
>             Fix For: 1.5
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The BaseProvider has a method to fill in remaining state if the input seed is 
> too short. The fill uses existing seed values to fill the remaining.
> The next state is created using:
> {code:java}
> long n = state[i - seed.length];
> state[i] = 1812433253L * (n ^ (n >> 30)) + i{code}
> If the existing state is zero then the new state is i. When the input seed 
> has no length then the filled state is a natural sequence.
> Here is a state of 10 filled from empty seeds of length 0 to 5:
> {noformat}
> 0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> 1: [0, 1, 1812433255, 3284914298392595265, 6102061520201954364, 
> -3308799481182342998, -3869692221293809580, -7101959917617921332, 
> 7986832403292652032, 8936067391732911773]
> 2: [0, 0, 2, 3, 3624866510, 5437299764, 6569828598597623783, 
> -8592001180344199076, 1136775338421644002, 8717367692712810396]
> 3: [0, 0, 0, 3, 4, 5, 5437299765, 7249733019, 9062166273, 
> -8592001182156632327]
> 4: [0, 0, 0, 0, 4, 5, 6, 7, 7249733020, 9062166274]
> 5: [0, 0, 0, 0, 0, 5, 6, 7, 8, 9]
> {noformat}
> When the seed is zero length or close to half the length of the desired state 
> and all zeros then the output state has a low number of non-zero bits.
> Note:
> This has little impact when using the Commons RNG simple module to create a 
> generator. The seed is produced to the correct length using a high quality 
> random source.
> A second issue is that the method to fill the state is an instance method. 
> Since it uses no state it could be a static method. I would suggest a method 
> to convert a seed to the correct length:
> {code:java}
> protected static long[] ensureSeedLength(long[] seed, int length); {code}
> This would allow classes that implement the following pattern:
> {code:java}
> MyRNG(long[] seed) {
>     if (seed.length < SEED_SIZE) {
>         final long[] state = new long[SEED_SIZE];
>         fillState(state, seed);
>         setState(state);
>     } else {
>         setState(seed);
>     }
> } {code}
> To simplify to:
> {code:java}
> MyRNG(long[] seed) {
>     setState(ensureSeedLength(seed, SEED_SIZE));
> }{code}
> h2. Compatibility
> The user guide states:
> {noformat}
> upon initialization, the underlying generation algorithm
> - may not use all the information contents of the seed,
> - may use a procedure (using the given seed as input) for further filling its 
> internal state (in order to avoid a too uniform initial state).
> In both cases, the behavior is not standard but should not change between 
> releases of the library (bugs notwithstanding).{noformat}
> Since behaviour *should not change* it would rule out changes for existing 
> classes. New classes could use the new static version to fill state.
> I would suggest providing a new method to ensure the input seed is a minimum 
> length. If the method seeds a SplitMix64 style generator with the first value 
> of the input seed (or zero if the seed length is zero) then the filled state 
> will be high quality. This type of generator only outputs zero once during 
> the period and so any seed length can be ensured to be non zero when it has 
> been expanded. An input seed of entirely zero values would be passed through 
> unchanged. This is the default *user beware* behaviour for full length zero 
> seeds.
> A 32-bit variant can be created using a similar hashing function that outputs 
> only a single 0 in the period, for example MurmurHash3's 32-bit finaliser 
> function.
> An example implementation for long values is:
> {code:java}
> private static final long GOLDEN_RATIO = 0x9e3779b97f4a7c15L
> protected static long[] ensureSeedLength(long[] seed, int length) {
>     if (seed.length < length) {
>         final long[] s = Arrays.copyOf(seed, length);
>         // Fill the rest as if using a SplitMix64 RNG
>         long x = s[0];
>         for (int i = seed.length; i < length; i++) {
>             s[i] = stafford13(x += GOLDEN_RATIO);
>         }
>         return s;
>     }
>     return seed;
> }
> private static long stafford13(long x) {
>     x = (x ^ (x >>> 30)) * 0xbf58476d1ce4e5b9L;
>     x = (x ^ (x >>> 27)) * 0x94d049bb133111ebL;
>     return x ^ (x >>> 31);
> }
> {code}
> A 32-bit mix function for Murmur32 is:
> {code:java}
> private static int murmur32(int x) {
>     x = (x ^ (x >>> 16)) * 0x85ebca6b;
>     x = (x ^ (x >>> 13)) * 0xc2b2ae35;
>     return x ^ (x >>> 16);
> }{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to