Re: [rng] Split and Jump functions

2019-06-10 Thread Gilles Sadowski
Hello.

LGTM.

Thanks,
Gilles

P.S. Thinking of releasing 1.3?

Le lun. 10 juin 2019 à 14:55, Alex Herbert  a écrit :
>
> I've submitted a PR with the functionality discussed in this thread for
> JumpableUniformRandomProvider [1].
>
> Some notes:
>
> - The jump must return a copy.
>
> This requires a bit of code all the way up to the base implementations
> of IntProvider and LongProvider to copy the current state. I've done
> this explicitly with copy constructors to avoid using clone() [2].
>
>
> - The jump has a few requirements that are tested:
>
> 1. The copy is a new instance of the same class.
>
> 2. The sequence output from the original RNG after the jump must match
> the reference implementation output.
>
> 3. The sequence output from the copy must match what would have been
> output from the original RNG.
>
>
> - The default implementations of the providers have cached state
>
> To ensure the state is copied this is tested by comparing the state of
> the copy with the state of the original pre-jump. This does not require
> any knowledge of the original state. Just that the state can be accesses
> using RestorableUniformRandomProvider.
>
> To ensure that the state is cleared in the original RNG post-jump
> requires that the state pre-jump is not zero and the state after the
> jump is zero.
>
> One method is to: create a generator, call nextInt or nextBoolean to
> make it cache some state, do the jump; compare the output of nextInt or
> nextBoolean to the copy and test it is different. However this requires
> that the initial state is new (all zero bytes) and so the test must be
> called in every test class for each generator by passing a new instance.
>
> Alternatively a test is made using the state from
> RestorableUniformRandomProvider. The size of the default state is known
> and post-jump is checked to ensure it is zero. This test does not
> require the pre-jump state to be zero and thus can be put into a
> parametric test for all the JumpableUniformRandomProviders.
>
> So the tests to check the output sequence post-jump are in the
> individual tests for each provider and a JumpableProvidersParametricTest
> tests the general requirements of a Jumpable RNG. This ensures all
> generators in the library pass tests to ensure common functionality (the
> copy is a new instance, the state is the same as the original, the
> post-jump state of the original has reset the cached state).
>
>
> - Some providers are LongJumpable
>
> This will be added for RNG-98 and I created this PR with future
> functionality in mind. The jump function to advance the state is named
> performJump. For those without a LongJump the coefficients are not
> passed as a parameter and can be hard coded. For any generator with a
> LongJump function the computation of the jump in the performJump
> function accepts an array of coefficients.  This will be called with the
> appropriate long jump coefficients to implement LongJumpable. The rest
> of the code will be unchanged and new tests can be added to check the
> long jump.
>
>
> Comments are welcomed.
>
> Alex
>
>
> [1] https://github.com/apache/commons-rng/pull/48
>
> [2] https://www.artima.com/intv/bloch13.html
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-06-10 Thread Alex Herbert
I've submitted a PR with the functionality discussed in this thread for 
JumpableUniformRandomProvider [1].


Some notes:

- The jump must return a copy.

This requires a bit of code all the way up to the base implementations 
of IntProvider and LongProvider to copy the current state. I've done 
this explicitly with copy constructors to avoid using clone() [2].



- The jump has a few requirements that are tested:

1. The copy is a new instance of the same class.

2. The sequence output from the original RNG after the jump must match 
the reference implementation output.


3. The sequence output from the copy must match what would have been 
output from the original RNG.



- The default implementations of the providers have cached state

To ensure the state is copied this is tested by comparing the state of 
the copy with the state of the original pre-jump. This does not require 
any knowledge of the original state. Just that the state can be accesses 
using RestorableUniformRandomProvider.


To ensure that the state is cleared in the original RNG post-jump 
requires that the state pre-jump is not zero and the state after the 
jump is zero.


One method is to: create a generator, call nextInt or nextBoolean to 
make it cache some state, do the jump; compare the output of nextInt or 
nextBoolean to the copy and test it is different. However this requires 
that the initial state is new (all zero bytes) and so the test must be 
called in every test class for each generator by passing a new instance.


Alternatively a test is made using the state from 
RestorableUniformRandomProvider. The size of the default state is known 
and post-jump is checked to ensure it is zero. This test does not 
require the pre-jump state to be zero and thus can be put into a 
parametric test for all the JumpableUniformRandomProviders.


So the tests to check the output sequence post-jump are in the 
individual tests for each provider and a JumpableProvidersParametricTest 
tests the general requirements of a Jumpable RNG. This ensures all 
generators in the library pass tests to ensure common functionality (the 
copy is a new instance, the state is the same as the original, the 
post-jump state of the original has reset the cached state).



- Some providers are LongJumpable

This will be added for RNG-98 and I created this PR with future 
functionality in mind. The jump function to advance the state is named 
performJump. For those without a LongJump the coefficients are not 
passed as a parameter and can be hard coded. For any generator with a 
LongJump function the computation of the jump in the performJump 
function accepts an array of coefficients.  This will be called with the 
appropriate long jump coefficients to implement LongJumpable. The rest 
of the code will be unchanged and new tests can be added to check the 
long jump.



Comments are welcomed.

Alex


[1] https://github.com/apache/commons-rng/pull/48

[2] https://www.artima.com/intv/bloch13.html




-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-05-03 Thread Alex Herbert



On 03/05/2019 13:04, Gilles Sadowski wrote:

Hello.

Le jeu. 2 mai 2019 à 23:49, Alex Herbert  a écrit :




On 1 May 2019, at 23:15, Gilles Sadowski  wrote:

Hi.


[...]

So do we do:

UniformRandomProvider restrict(JumpableUniformRandomProvider);
JumpableUniformRandomProvider restrict(LongJumpableUniformRandomProvider);
UniformRandomProvider restrict(RestorableUniformRandomProvider);

Or:

UniformRandomProvider unjumpable(JumpableUniformRandomProvider);
JumpableUniformRandomProvider unlongJumpable(LongJumpableUniformRandomProvider);

I'm a bit hesitant on the spelling…

Do you mean unlongJumpable vs unLongJumpable vs unlongjumpable? In that regard 
I was maintaining the likeness to unrestorable, but since there are two words 
after 'un' I put the second with camelcase.

I had noticed that the consistent name would be "unlongJumpable" but
"unlong" just looks like a typo. :-{

Or just the entire method name? I don’t like it much but its function is clear: 
allow access to jump() but not longJump().

Shall we just leave out those convenience methods until there is an
explicit need?  As discussed previously, it doesn't seem to me that
the added safety is not as useful as for "unrestorable".


OK. Seems reasonable.

Note that 'unjumpable' would apply to either Jumpable or LongJumpable. 
However as you state it is easiest to not have them to begin with.




Regards,
Gilles


UniformRandomProvider unrestorable(RestorableUniformRandomProvider);

The later option only adds two new methods. The first has 3 new methods 
(deprecating unrestorable with restrict) but suffers from having to cast 
instances of multiple interfaces to ensure the correct restrict is called.

Oops indeed.
This is too error-prone.


So this makes me favour the verbosely named option.

+1

Regards,
Gilles


Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-05-03 Thread Gilles Sadowski
Hello.

Le jeu. 2 mai 2019 à 23:49, Alex Herbert  a écrit :
>
>
>
> > On 1 May 2019, at 23:15, Gilles Sadowski  wrote:
> >
> > Hi.
> >
> >>> [...]
> >>
> >> So do we do:
> >>
> >> UniformRandomProvider restrict(JumpableUniformRandomProvider);
> >> JumpableUniformRandomProvider restrict(LongJumpableUniformRandomProvider);
> >> UniformRandomProvider restrict(RestorableUniformRandomProvider);
> >>
> >> Or:
> >>
> >> UniformRandomProvider unjumpable(JumpableUniformRandomProvider);
> >> JumpableUniformRandomProvider 
> >> unlongJumpable(LongJumpableUniformRandomProvider);
> >
> > I'm a bit hesitant on the spelling…
>
> Do you mean unlongJumpable vs unLongJumpable vs unlongjumpable? In that 
> regard I was maintaining the likeness to unrestorable, but since there are 
> two words after 'un' I put the second with camelcase.

I had noticed that the consistent name would be "unlongJumpable" but
"unlong" just looks like a typo. :-{

>
> Or just the entire method name? I don’t like it much but its function is 
> clear: allow access to jump() but not longJump().

Shall we just leave out those convenience methods until there is an
explicit need?  As discussed previously, it doesn't seem to me that
the added safety is not as useful as for "unrestorable".

Regards,
Gilles

>
> >
> >> UniformRandomProvider unrestorable(RestorableUniformRandomProvider);
> >>
> >> The later option only adds two new methods. The first has 3 new methods 
> >> (deprecating unrestorable with restrict) but suffers from having to cast 
> >> instances of multiple interfaces to ensure the correct restrict is called.
> >
> > Oops indeed.
> > This is too error-prone.
> >
> >> So this makes me favour the verbosely named option.
> >
> > +1
> >
> > Regards,
> > Gilles
> >
> >>
> >> Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-05-02 Thread Alex Herbert



> On 1 May 2019, at 23:15, Gilles Sadowski  wrote:
> 
> Hi.
> 
>>> [...]
>> 
>> So do we do:
>> 
>> UniformRandomProvider restrict(JumpableUniformRandomProvider);
>> JumpableUniformRandomProvider restrict(LongJumpableUniformRandomProvider);
>> UniformRandomProvider restrict(RestorableUniformRandomProvider);
>> 
>> Or:
>> 
>> UniformRandomProvider unjumpable(JumpableUniformRandomProvider);
>> JumpableUniformRandomProvider 
>> unlongJumpable(LongJumpableUniformRandomProvider);
> 
> I'm a bit hesitant on the spelling…

Do you mean unlongJumpable vs unLongJumpable vs unlongjumpable? In that regard 
I was maintaining the likeness to unrestorable, but since there are two words 
after 'un' I put the second with camelcase.

Or just the entire method name? I don’t like it much but its function is clear: 
allow access to jump() but not longJump().

> 
>> UniformRandomProvider unrestorable(RestorableUniformRandomProvider);
>> 
>> The later option only adds two new methods. The first has 3 new methods 
>> (deprecating unrestorable with restrict) but suffers from having to cast 
>> instances of multiple interfaces to ensure the correct restrict is called.
> 
> Oops indeed.
> This is too error-prone.
> 
>> So this makes me favour the verbosely named option.
> 
> +1
> 
> Regards,
> Gilles
> 
>> 
>> Alex
>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-05-01 Thread Gilles Sadowski
Hi.

> > [...]
>
> So do we do:
>
> UniformRandomProvider restrict(JumpableUniformRandomProvider);
> JumpableUniformRandomProvider restrict(LongJumpableUniformRandomProvider);
> UniformRandomProvider restrict(RestorableUniformRandomProvider);
>
> Or:
>
> UniformRandomProvider unjumpable(JumpableUniformRandomProvider);
> JumpableUniformRandomProvider 
> unlongJumpable(LongJumpableUniformRandomProvider);

I'm a bit hesitant on the spelling...

> UniformRandomProvider unrestorable(RestorableUniformRandomProvider);
>
> The later option only adds two new methods. The first has 3 new methods 
> (deprecating unrestorable with restrict) but suffers from having to cast 
> instances of multiple interfaces to ensure the correct restrict is called.

Oops indeed.
This is too error-prone.

> So this makes me favour the verbosely named option.

+1

Regards,
Gilles

>
> Alex
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-05-01 Thread Alex Herbert


> On 1 May 2019, at 16:05, Gilles Sadowski  wrote:
> 
> Hi.
> 
> Le mar. 30 avr. 2019 à 17:08, Alex Herbert  > a écrit :
>> 
>> On 29/04/2019 22:14, Gilles Sadowski wrote:
>>> Hello.
>>> 
>>> Le lun. 29 avr. 2019 à 19:09, Alex Herbert  a 
>>> écrit :
 On 28/04/2019 19:11, Gilles Sadowski wrote:
> Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a 
> écrit :
>> 
>>> On 28 Apr 2019, at 00:59, Bernd Eckenfels  
>>> wrote:
>>> 
>>> Hello,
>>> 
>>> Just a question, I am unclear on the terminology, is „jump“ (did I miss 
>>> the discussion leading toot?) something invented here? It sounds to me 
>>> like this is a generator where the state can be cloned and it is 
>>> „seekable“. It probably makes sense to have those two dimensions 
>>> separated anyway.
>> Hi Bernd, thanks for the input.
>> 
>> This thread started with the definition:
>> Jump:
>> 
>> To create a new instance of the generator that is deterministically 
>> based on the state of the current instance but advanced a set number of 
>> iterations.
>> 
>> 
>> However it is not required to create a new instance at the same time as 
>> jumping. You are correct in that this is two functionalities:
>> 
>> 1. Jump forward in the sequence
>> 2. Copy
>> 
>> However the two are coupled. Having jump on its own is useless (why move 
>> forward in the sequence without using it?). So a copy needs to be 
>> created somewhere before/after the jump.
>> 
>> The idea of a jump is to create a series of the generator at different 
>> points in the state. The generators can be used for parallel 
>> computations and will be ensured to not overlap in their output sequence 
>> for number of outputs skipped by the jump length.
>> 
>> FYI. The generators that support this have jump sizes of 2^64, 96, 128, 
>> 192, 256 and 512. So this is a lot of output sequence to jump.
>> 
>> Copy on its own works but for what purpose? If you want a second 
>> generator at the moment you just create a new one (with a different 
>> seed). Duplicate copies of generators is prone to potential pitfalls 
>> where simulations are not as random as you intend. For a special use 
>> case where you wish to run multiple simulations with the same generator 
>> you can use the Restorable interface to save the state of one and 
>> re-create it in other instances.
>> 
>> The current thread came to the choice of:
>> 
> So the options are (in all cases returning the copy):
> 
> 1. createAndJumpCopy
> 2. copyAndJumpParent
> 3. jumpParentAndCopy
> 4. jump and copy separately
>> Jump and copy separately was ruled out to discourage misuse of copy.
>> 
>> The current suggestion is 1. Create a copy and jump that ahead. The 
>> current instance is not affected.
>> 
>> I now consider this to be weaker for a variety of use cases than 2. This 
>> copies the current state for use and then jumps the parent ahead. So 
>> this alters the state of the parent generator.
>> 
>> Note that all other methods of a generator alter its state. So having 
>> jump alter its state is reasonable.
>> 
>> The most flexible API is to separate jump and copy into two methods. We 
>> can still support helper functions that take in a Jumpable generator and 
>> create a jump series of generators for parallel work. Separating jump 
>> and copy allows the functionality to be used in a larger number of ways 
>> than any other interface that attempts to combine jump and copy.
>> 
>> I am fine with having separate jump and copy. If so the copy method, 
>> being part of the Jumpable interface, will be functionally coupled with 
>> the jump method and should be described in Javadoc with the intended 
>> purpose to use it to copy the parent state either before or after a jump 
>> into a child generator.
>> 
>> As a precursor this API is very flexible:
>> 
>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>  /** Jump and return same instance. */
>>  JumpableUniformRandomProvider jump();
>>  /** Copy the instance. */
>>  JumpableUniformRandomProvider copy();
>> }
>> 
>> Returning the same instance in jump() allows method chaining such as 
>> either:
>> 
>> rng.jump().copy();
>> rng.copy().jump();
>> 
>> This potential pitfall is that a user may do this:
>> 
>> UniformRandomProvider rng1 = rng.copy().jump();
>> UniformRandomProvider rng2 = rng.copy().jump();
>> 
>> Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:
>> 
>> UniformRandomProvider rng1 = rng.jump();
>> UniformRandomProvider rng2 = rng.jump();
>> 
>> Where rng, r

Re: [rng] Split and Jump functions

2019-05-01 Thread Gilles Sadowski
Hi.

Le mar. 30 avr. 2019 à 17:08, Alex Herbert  a écrit :
>
> On 29/04/2019 22:14, Gilles Sadowski wrote:
> > Hello.
> >
> > Le lun. 29 avr. 2019 à 19:09, Alex Herbert  a 
> > écrit :
> >> On 28/04/2019 19:11, Gilles Sadowski wrote:
> >>> Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a 
> >>> écrit :
> 
> > On 28 Apr 2019, at 00:59, Bernd Eckenfels  
> > wrote:
> >
> > Hello,
> >
> > Just a question, I am unclear on the terminology, is „jump“ (did I miss 
> > the discussion leading toot?) something invented here? It sounds to me 
> > like this is a generator where the state can be cloned and it is 
> > „seekable“. It probably makes sense to have those two dimensions 
> > separated anyway.
>  Hi Bernd, thanks for the input.
> 
>  This thread started with the definition:
>  Jump:
> 
>  To create a new instance of the generator that is deterministically 
>  based on the state of the current instance but advanced a set number of 
>  iterations.
> 
> 
>  However it is not required to create a new instance at the same time as 
>  jumping. You are correct in that this is two functionalities:
> 
>  1. Jump forward in the sequence
>  2. Copy
> 
>  However the two are coupled. Having jump on its own is useless (why move 
>  forward in the sequence without using it?). So a copy needs to be 
>  created somewhere before/after the jump.
> 
>  The idea of a jump is to create a series of the generator at different 
>  points in the state. The generators can be used for parallel 
>  computations and will be ensured to not overlap in their output sequence 
>  for number of outputs skipped by the jump length.
> 
>  FYI. The generators that support this have jump sizes of 2^64, 96, 128, 
>  192, 256 and 512. So this is a lot of output sequence to jump.
> 
>  Copy on its own works but for what purpose? If you want a second 
>  generator at the moment you just create a new one (with a different 
>  seed). Duplicate copies of generators is prone to potential pitfalls 
>  where simulations are not as random as you intend. For a special use 
>  case where you wish to run multiple simulations with the same generator 
>  you can use the Restorable interface to save the state of one and 
>  re-create it in other instances.
> 
>  The current thread came to the choice of:
> 
> >>> So the options are (in all cases returning the copy):
> >>>
> >>> 1. createAndJumpCopy
> >>> 2. copyAndJumpParent
> >>> 3. jumpParentAndCopy
> >>> 4. jump and copy separately
>  Jump and copy separately was ruled out to discourage misuse of copy.
> 
>  The current suggestion is 1. Create a copy and jump that ahead. The 
>  current instance is not affected.
> 
>  I now consider this to be weaker for a variety of use cases than 2. This 
>  copies the current state for use and then jumps the parent ahead. So 
>  this alters the state of the parent generator.
> 
>  Note that all other methods of a generator alter its state. So having 
>  jump alter its state is reasonable.
> 
>  The most flexible API is to separate jump and copy into two methods. We 
>  can still support helper functions that take in a Jumpable generator and 
>  create a jump series of generators for parallel work. Separating jump 
>  and copy allows the functionality to be used in a larger number of ways 
>  than any other interface that attempts to combine jump and copy.
> 
>  I am fine with having separate jump and copy. If so the copy method, 
>  being part of the Jumpable interface, will be functionally coupled with 
>  the jump method and should be described in Javadoc with the intended 
>  purpose to use it to copy the parent state either before or after a jump 
>  into a child generator.
> 
>  As a precursor this API is very flexible:
> 
>  JumpableUniformRandomProvider extends UniformRandomProvider {
>    /** Jump and return same instance. */
>    JumpableUniformRandomProvider jump();
>    /** Copy the instance. */
>    JumpableUniformRandomProvider copy();
>  }
> 
>  Returning the same instance in jump() allows method chaining such as 
>  either:
> 
>  rng.jump().copy();
>  rng.copy().jump();
> 
>  This potential pitfall is that a user may do this:
> 
>  UniformRandomProvider rng1 = rng.copy().jump();
>  UniformRandomProvider rng2 = rng.copy().jump();
> 
>  Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:
> 
>  UniformRandomProvider rng1 = rng.jump();
>  UniformRandomProvider rng2 = rng.jump();
> 
>  Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the 
>  start point.
> 
>  I think our purpose is to provi

Re: [rng] Split and Jump functions

2019-04-30 Thread Alex Herbert

On 29/04/2019 22:14, Gilles Sadowski wrote:

Hello.

Le lun. 29 avr. 2019 à 19:09, Alex Herbert  a écrit :

On 28/04/2019 19:11, Gilles Sadowski wrote:

Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a écrit :



On 28 Apr 2019, at 00:59, Bernd Eckenfels  wrote:

Hello,

Just a question, I am unclear on the terminology, is „jump“ (did I miss the 
discussion leading toot?) something invented here? It sounds to me like this is 
a generator where the state can be cloned and it is „seekable“. It probably 
makes sense to have those two dimensions separated anyway.

Hi Bernd, thanks for the input.

This thread started with the definition:
Jump:

To create a new instance of the generator that is deterministically based on 
the state of the current instance but advanced a set number of iterations.


However it is not required to create a new instance at the same time as 
jumping. You are correct in that this is two functionalities:

1. Jump forward in the sequence
2. Copy

However the two are coupled. Having jump on its own is useless (why move 
forward in the sequence without using it?). So a copy needs to be created 
somewhere before/after the jump.

The idea of a jump is to create a series of the generator at different points 
in the state. The generators can be used for parallel computations and will be 
ensured to not overlap in their output sequence for number of outputs skipped 
by the jump length.

FYI. The generators that support this have jump sizes of 2^64, 96, 128, 192, 
256 and 512. So this is a lot of output sequence to jump.

Copy on its own works but for what purpose? If you want a second generator at 
the moment you just create a new one (with a different seed). Duplicate copies 
of generators is prone to potential pitfalls where simulations are not as 
random as you intend. For a special use case where you wish to run multiple 
simulations with the same generator you can use the Restorable interface to 
save the state of one and re-create it in other instances.

The current thread came to the choice of:


So the options are (in all cases returning the copy):

1. createAndJumpCopy
2. copyAndJumpParent
3. jumpParentAndCopy
4. jump and copy separately

Jump and copy separately was ruled out to discourage misuse of copy.

The current suggestion is 1. Create a copy and jump that ahead. The current 
instance is not affected.

I now consider this to be weaker for a variety of use cases than 2. This copies 
the current state for use and then jumps the parent ahead. So this alters the 
state of the parent generator.

Note that all other methods of a generator alter its state. So having jump 
alter its state is reasonable.

The most flexible API is to separate jump and copy into two methods. We can 
still support helper functions that take in a Jumpable generator and create a 
jump series of generators for parallel work. Separating jump and copy allows 
the functionality to be used in a larger number of ways than any other 
interface that attempts to combine jump and copy.

I am fine with having separate jump and copy. If so the copy method, being part 
of the Jumpable interface, will be functionally coupled with the jump method 
and should be described in Javadoc with the intended purpose to use it to copy 
the parent state either before or after a jump into a child generator.

As a precursor this API is very flexible:

JumpableUniformRandomProvider extends UniformRandomProvider {
  /** Jump and return same instance. */
  JumpableUniformRandomProvider jump();
  /** Copy the instance. */
  JumpableUniformRandomProvider copy();
}

Returning the same instance in jump() allows method chaining such as either:

rng.jump().copy();
rng.copy().jump();

This potential pitfall is that a user may do this:

UniformRandomProvider rng1 = rng.copy().jump();
UniformRandomProvider rng2 = rng.copy().jump();

Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:

UniformRandomProvider rng1 = rng.jump();
UniformRandomProvider rng2 = rng.jump();

Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the start 
point.

I think our purpose is to provide an API for the generators that can jump and 
that is not too restrictive given the use cases we have so far thought up. 
There may be other ideas of use cases that cannot be done with the coupled 
functionality of:

JumpableUniformRandomProvider extends UniformRandomProvider {
  /** Copy the instance, then jump ahead. Return the copy of the previous 
state. */
  JumpableUniformRandomProvider jump();
}

JumpableUniformRandomProvider extends UniformRandomProvider {
  /** Copy the instance, then jump the copy ahead. Return the copy. The 
current instance is not affected. */
  JumpableUniformRandomProvider jump();
}

So the split functions without allowing method chaining:

JumpableUniformRandomProvider extends UniformRandomProvider {
  /** Jump the current instance ahead. */
  void jump();
  /** Copy the inst

Re: [rng] Split and Jump functions

2019-04-29 Thread Gilles Sadowski
Hello.

Le lun. 29 avr. 2019 à 19:09, Alex Herbert  a écrit :
>
> On 28/04/2019 19:11, Gilles Sadowski wrote:
> > Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a 
> > écrit :
> >>
> >>
> >>> On 28 Apr 2019, at 00:59, Bernd Eckenfels  wrote:
> >>>
> >>> Hello,
> >>>
> >>> Just a question, I am unclear on the terminology, is „jump“ (did I miss 
> >>> the discussion leading toot?) something invented here? It sounds to me 
> >>> like this is a generator where the state can be cloned and it is 
> >>> „seekable“. It probably makes sense to have those two dimensions 
> >>> separated anyway.
> >> Hi Bernd, thanks for the input.
> >>
> >> This thread started with the definition:
> >> Jump:
> >>
> >> To create a new instance of the generator that is deterministically based 
> >> on the state of the current instance but advanced a set number of 
> >> iterations.
> >>
> >>
> >> However it is not required to create a new instance at the same time as 
> >> jumping. You are correct in that this is two functionalities:
> >>
> >> 1. Jump forward in the sequence
> >> 2. Copy
> >>
> >> However the two are coupled. Having jump on its own is useless (why move 
> >> forward in the sequence without using it?). So a copy needs to be created 
> >> somewhere before/after the jump.
> >>
> >> The idea of a jump is to create a series of the generator at different 
> >> points in the state. The generators can be used for parallel computations 
> >> and will be ensured to not overlap in their output sequence for number of 
> >> outputs skipped by the jump length.
> >>
> >> FYI. The generators that support this have jump sizes of 2^64, 96, 128, 
> >> 192, 256 and 512. So this is a lot of output sequence to jump.
> >>
> >> Copy on its own works but for what purpose? If you want a second generator 
> >> at the moment you just create a new one (with a different seed). Duplicate 
> >> copies of generators is prone to potential pitfalls where simulations are 
> >> not as random as you intend. For a special use case where you wish to run 
> >> multiple simulations with the same generator you can use the Restorable 
> >> interface to save the state of one and re-create it in other instances.
> >>
> >> The current thread came to the choice of:
> >>
> > So the options are (in all cases returning the copy):
> >
> > 1. createAndJumpCopy
> > 2. copyAndJumpParent
> > 3. jumpParentAndCopy
> > 4. jump and copy separately
> >> Jump and copy separately was ruled out to discourage misuse of copy.
> >>
> >> The current suggestion is 1. Create a copy and jump that ahead. The 
> >> current instance is not affected.
> >>
> >> I now consider this to be weaker for a variety of use cases than 2. This 
> >> copies the current state for use and then jumps the parent ahead. So this 
> >> alters the state of the parent generator.
> >>
> >> Note that all other methods of a generator alter its state. So having jump 
> >> alter its state is reasonable.
> >>
> >> The most flexible API is to separate jump and copy into two methods. We 
> >> can still support helper functions that take in a Jumpable generator and 
> >> create a jump series of generators for parallel work. Separating jump and 
> >> copy allows the functionality to be used in a larger number of ways than 
> >> any other interface that attempts to combine jump and copy.
> >>
> >> I am fine with having separate jump and copy. If so the copy method, being 
> >> part of the Jumpable interface, will be functionally coupled with the jump 
> >> method and should be described in Javadoc with the intended purpose to use 
> >> it to copy the parent state either before or after a jump into a child 
> >> generator.
> >>
> >> As a precursor this API is very flexible:
> >>
> >> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>  /** Jump and return same instance. */
> >>  JumpableUniformRandomProvider jump();
> >>  /** Copy the instance. */
> >>  JumpableUniformRandomProvider copy();
> >> }
> >>
> >> Returning the same instance in jump() allows method chaining such as 
> >> either:
> >>
> >> rng.jump().copy();
> >> rng.copy().jump();
> >>
> >> This potential pitfall is that a user may do this:
> >>
> >> UniformRandomProvider rng1 = rng.copy().jump();
> >> UniformRandomProvider rng2 = rng.copy().jump();
> >>
> >> Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:
> >>
> >> UniformRandomProvider rng1 = rng.jump();
> >> UniformRandomProvider rng2 = rng.jump();
> >>
> >> Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the 
> >> start point.
> >>
> >> I think our purpose is to provide an API for the generators that can jump 
> >> and that is not too restrictive given the use cases we have so far thought 
> >> up. There may be other ideas of use cases that cannot be done with the 
> >> coupled functionality of:
> >>
> >> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>  /** Copy the instance, then jump ahead. R

Re: [rng] Split and Jump functions

2019-04-29 Thread Alex Herbert

On 28/04/2019 19:11, Gilles Sadowski wrote:

Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a écrit :




On 28 Apr 2019, at 00:59, Bernd Eckenfels  wrote:

Hello,

Just a question, I am unclear on the terminology, is „jump“ (did I miss the 
discussion leading toot?) something invented here? It sounds to me like this is 
a generator where the state can be cloned and it is „seekable“. It probably 
makes sense to have those two dimensions separated anyway.

Hi Bernd, thanks for the input.

This thread started with the definition:
Jump:

To create a new instance of the generator that is deterministically based on 
the state of the current instance but advanced a set number of iterations.


However it is not required to create a new instance at the same time as 
jumping. You are correct in that this is two functionalities:

1. Jump forward in the sequence
2. Copy

However the two are coupled. Having jump on its own is useless (why move 
forward in the sequence without using it?). So a copy needs to be created 
somewhere before/after the jump.

The idea of a jump is to create a series of the generator at different points 
in the state. The generators can be used for parallel computations and will be 
ensured to not overlap in their output sequence for number of outputs skipped 
by the jump length.

FYI. The generators that support this have jump sizes of 2^64, 96, 128, 192, 
256 and 512. So this is a lot of output sequence to jump.

Copy on its own works but for what purpose? If you want a second generator at 
the moment you just create a new one (with a different seed). Duplicate copies 
of generators is prone to potential pitfalls where simulations are not as 
random as you intend. For a special use case where you wish to run multiple 
simulations with the same generator you can use the Restorable interface to 
save the state of one and re-create it in other instances.

The current thread came to the choice of:


So the options are (in all cases returning the copy):

1. createAndJumpCopy
2. copyAndJumpParent
3. jumpParentAndCopy
4. jump and copy separately

Jump and copy separately was ruled out to discourage misuse of copy.

The current suggestion is 1. Create a copy and jump that ahead. The current 
instance is not affected.

I now consider this to be weaker for a variety of use cases than 2. This copies 
the current state for use and then jumps the parent ahead. So this alters the 
state of the parent generator.

Note that all other methods of a generator alter its state. So having jump 
alter its state is reasonable.

The most flexible API is to separate jump and copy into two methods. We can 
still support helper functions that take in a Jumpable generator and create a 
jump series of generators for parallel work. Separating jump and copy allows 
the functionality to be used in a larger number of ways than any other 
interface that attempts to combine jump and copy.

I am fine with having separate jump and copy. If so the copy method, being part 
of the Jumpable interface, will be functionally coupled with the jump method 
and should be described in Javadoc with the intended purpose to use it to copy 
the parent state either before or after a jump into a child generator.

As a precursor this API is very flexible:

JumpableUniformRandomProvider extends UniformRandomProvider {
 /** Jump and return same instance. */
 JumpableUniformRandomProvider jump();
 /** Copy the instance. */
 JumpableUniformRandomProvider copy();
}

Returning the same instance in jump() allows method chaining such as either:

rng.jump().copy();
rng.copy().jump();

This potential pitfall is that a user may do this:

UniformRandomProvider rng1 = rng.copy().jump();
UniformRandomProvider rng2 = rng.copy().jump();

Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:

UniformRandomProvider rng1 = rng.jump();
UniformRandomProvider rng2 = rng.jump();

Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the start 
point.

I think our purpose is to provide an API for the generators that can jump and 
that is not too restrictive given the use cases we have so far thought up. 
There may be other ideas of use cases that cannot be done with the coupled 
functionality of:

JumpableUniformRandomProvider extends UniformRandomProvider {
 /** Copy the instance, then jump ahead. Return the copy of the previous 
state. */
 JumpableUniformRandomProvider jump();
}

JumpableUniformRandomProvider extends UniformRandomProvider {
 /** Copy the instance, then jump the copy ahead. Return the copy. The 
current instance is not affected. */
 JumpableUniformRandomProvider jump();
}

So the split functions without allowing method chaining:

JumpableUniformRandomProvider extends UniformRandomProvider {
 /** Jump the current instance ahead. */
 void jump();
 /** Copy the instance. This is intended to be used either before or after 
a call to jump()
  * to create a series of generators. *

Re: [rng] Split and Jump functions

2019-04-28 Thread Gilles Sadowski
Le dim. 28 avr. 2019 à 17:02, Alex Herbert  a écrit :
>
>
>
> > On 28 Apr 2019, at 00:59, Bernd Eckenfels  wrote:
> >
> > Hello,
> >
> > Just a question, I am unclear on the terminology, is „jump“ (did I miss the 
> > discussion leading toot?) something invented here? It sounds to me like 
> > this is a generator where the state can be cloned and it is „seekable“. It 
> > probably makes sense to have those two dimensions separated anyway.
>
> Hi Bernd, thanks for the input.
>
> This thread started with the definition:
> Jump:
>
> To create a new instance of the generator that is deterministically based on 
> the state of the current instance but advanced a set number of iterations.
>
>
> However it is not required to create a new instance at the same time as 
> jumping. You are correct in that this is two functionalities:
>
> 1. Jump forward in the sequence
> 2. Copy
>
> However the two are coupled. Having jump on its own is useless (why move 
> forward in the sequence without using it?). So a copy needs to be created 
> somewhere before/after the jump.
>
> The idea of a jump is to create a series of the generator at different points 
> in the state. The generators can be used for parallel computations and will 
> be ensured to not overlap in their output sequence for number of outputs 
> skipped by the jump length.
>
> FYI. The generators that support this have jump sizes of 2^64, 96, 128, 192, 
> 256 and 512. So this is a lot of output sequence to jump.
>
> Copy on its own works but for what purpose? If you want a second generator at 
> the moment you just create a new one (with a different seed). Duplicate 
> copies of generators is prone to potential pitfalls where simulations are not 
> as random as you intend. For a special use case where you wish to run 
> multiple simulations with the same generator you can use the Restorable 
> interface to save the state of one and re-create it in other instances.
>
> The current thread came to the choice of:
>
> >>> So the options are (in all cases returning the copy):
> >>>
> >>> 1. createAndJumpCopy
> >>> 2. copyAndJumpParent
> >>> 3. jumpParentAndCopy
> >>> 4. jump and copy separately
>
> Jump and copy separately was ruled out to discourage misuse of copy.
>
> The current suggestion is 1. Create a copy and jump that ahead. The current 
> instance is not affected.
>
> I now consider this to be weaker for a variety of use cases than 2. This 
> copies the current state for use and then jumps the parent ahead. So this 
> alters the state of the parent generator.
>
> Note that all other methods of a generator alter its state. So having jump 
> alter its state is reasonable.
>
> The most flexible API is to separate jump and copy into two methods. We can 
> still support helper functions that take in a Jumpable generator and create a 
> jump series of generators for parallel work. Separating jump and copy allows 
> the functionality to be used in a larger number of ways than any other 
> interface that attempts to combine jump and copy.
>
> I am fine with having separate jump and copy. If so the copy method, being 
> part of the Jumpable interface, will be functionally coupled with the jump 
> method and should be described in Javadoc with the intended purpose to use it 
> to copy the parent state either before or after a jump into a child generator.
>
> As a precursor this API is very flexible:
>
> JumpableUniformRandomProvider extends UniformRandomProvider {
> /** Jump and return same instance. */
> JumpableUniformRandomProvider jump();
> /** Copy the instance. */
> JumpableUniformRandomProvider copy();
> }
>
> Returning the same instance in jump() allows method chaining such as either:
>
> rng.jump().copy();
> rng.copy().jump();
>
> This potential pitfall is that a user may do this:
>
> UniformRandomProvider rng1 = rng.copy().jump();
> UniformRandomProvider rng2 = rng.copy().jump();
>
> Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:
>
> UniformRandomProvider rng1 = rng.jump();
> UniformRandomProvider rng2 = rng.jump();
>
> Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the start 
> point.
>
> I think our purpose is to provide an API for the generators that can jump and 
> that is not too restrictive given the use cases we have so far thought up. 
> There may be other ideas of use cases that cannot be done with the coupled 
> functionality of:
>
> JumpableUniformRandomProvider extends UniformRandomProvider {
> /** Copy the instance, then jump ahead. Return the copy of the previous 
> state. */
> JumpableUniformRandomProvider jump();
> }
>
> JumpableUniformRandomProvider extends UniformRandomProvider {
> /** Copy the instance, then jump the copy ahead. Return the copy. The 
> current instance is not affected. */
> JumpableUniformRandomProvider jump();
> }
>
> So the split functions without allowing method chaining:
>
> JumpableUniformRandomProvider extends UniformRandomProvider {
> /

Re: [rng] Split and Jump functions

2019-04-28 Thread Gilles Sadowski
Hi.

>
> Just a question, I am unclear on the terminology, is „jump“ (did I miss the 
> discussion leading toot?) something invented here?

Not invented here: It's a functionality that exist for some RNG algorithms.

> It sounds to me like this is a generator where the state can be cloned and it 
> is „seekable“. It probably makes sense to have those two dimensions separated 
> anyway.
>
> Gruss
> Bernd
>
>> [...]

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-28 Thread Alex Herbert
tors. */
JumpableUniformRandomProvider copy();
}

WDYT?

Alex


> 
> Gruss
> Bernd
> 
> 
> --
> http://bernd.eckenfels.net
> 
> 
> Von: Gilles Sadowski 
> Gesendet: Sonntag, April 28, 2019 12:34 AM
> An: Commons Developers List
> Betreff: Re: [rng] Split and Jump functions
> 
> Hello.
> 
>> 
>> 
>>> On 27 Apr 2019, at 14:49, Gilles Sadowski  wrote:
>>> 
>>> Hi.
>>> 
>>> Le sam. 27 avr. 2019 à 15:05, Alex Herbert >> <mailto:alex.d.herb...@gmail.com>> a écrit :
>>>> 
>>>> I have created RNG-97 and RNG-98 for Jump and LongJump.
>>>> 
>>>> Please take a look and comment.
>>>> 
>>>> The documentation highlights the implementation detail that a jump or long 
>>>> jump creates a copy that is far ahead. The original generator is not 
>>>> effected.
>>>> 
>>>> The use case is thus:
>>>> 
>>>> rng1 = …;
>>>> rng2 = rng1.jump();
>>>> rng3 = rng2.jump();
>>>> rng4 = rng3.jump();
>>>> 
>>>> As opposed to:
>>>> 
>>>> rng1 = …;
>>>> rng2 = rng1.jump();
>>>> rng3 = rng1.jump();
>>>> rng4 = rng1.jump();
>>>> 
>>>> Where rng1 will be advanced each time leaving behind a copy generator.
>>>> 
>>>> In either case it will be an overlap problem if any of the children are 
>>>> then used for jumping. So as long as the documentation is clear then this 
>>>> is OK. The helper method to create a jump series (or long jump series) in 
>>>> RandomSource seems the best way to avoid incorrect usage.
>>> 
>>> +1
>>> 
>>> I think that the default should be to prevent a "jump" on the returned
>>> instances.
>>> An overload could be defined with a parameter (e.g. "allowFurtherJump") but 
>>> I'd
>>> leave it out until it is requested based on an actual use-case.
>> 
>> I presume you are talking about the helper method in RandomSource.
>> 
>> However it does open the possibility instead of this:
>> 
>> JumpableUniformRandomProvider {
>> UniformRandomProvider jump();
>> }
>> 
>> This only works if the state is modified for the current instance to allow 
>> chaining jumps.
>> 
>> Having typed all this up into a summary for the two tickets I feel that they 
>> implement the idea in the wrong way. I think the jump should advance the 
>> state of the current generator. This is the master generator created and 
>> used in the high level code that controls the number of jumps that are 
>> required. The returned copy should be a copy of where the generator was. The 
>> copy should not be used for further jumps. In this way the interface for 
>> jump could be made to return a UniformRandomProvider.
>> 
>> When done like that the jumpable RNG is the only thing you need to hold a 
>> reference to. And you can later decide (perhaps dynamically) if you need to 
>> do some more jumps to get another series. Each call to jump moves the master 
>> along and leaves behind a RNG that can be used for a set number of cycles 
>> (the jump length). So you can do:
>> 
>> JumpableUniformRandomProvider rng = …;
>> 
>> UniformRandomProvider[] series1 = RandomSource.createJumpSeries(rng);
>> // Do work with series1 and then maybe
>> UniformRandomProvider[] series2 = RandomSource.createJumpSeries(rng);
>> // Do work with series2, etc
>> UniformRandomProvider[] series3 = RandomSource.createJumpSeries(rng);
>> 
>> Or
>> 
>> JumpableUniformRandomProvider masterRng = …;
>> 
>> ExecutorService executor = Executors.newCachedThreadPool();
>> ArrayList> futures = new ArrayList<>();
>> for (Input input : inputs) {
>> final UniformRandomProvider rng = masterRng.jump();
>> futures.add(executor.submit(new Callable() {
>> // Do something random with rng, then
>> return new Result(...);
>> }));
>> }
>> 
>> The later example uses ‘inputs’ as something where perhaps the size is not 
>> known such as an Iterable or likewise in Java 8 it could be written to 
>> consume a Stream.
> 
> That's a convincing example!
> 
>> Similarly the LongJumpableUniformRandomProvider interface can return a 
>> JumpableUniformRandomProvider so preventing the result from being used for 
>> another long jump but it can be used for (short) jumps.
>> 
>> Have a think on use cases but my feeling is that the interface is more 
>> powerful if you do advance the state and leave copies behind, rather than 
>> creating future copies which must be chained together to create a series.
> 
> OK to change the perspective. ;-)
> 
> Gilles
> 
>> 
>> Alex
>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 



Re: [rng] Split and Jump functions

2019-04-27 Thread Bernd Eckenfels
Hello,

Just a question, I am unclear on the terminology, is „jump“ (did I miss the 
discussion leading toot?) something invented here? It sounds to me like this is 
a generator where the state can be cloned and it is „seekable“. It probably 
makes sense to have those two dimensions separated anyway.

Gruss
Bernd


--
http://bernd.eckenfels.net


Von: Gilles Sadowski 
Gesendet: Sonntag, April 28, 2019 12:34 AM
An: Commons Developers List
Betreff: Re: [rng] Split and Jump functions

Hello.

>
>
> > On 27 Apr 2019, at 14:49, Gilles Sadowski  wrote:
> >
> > Hi.
> >
> > Le sam. 27 avr. 2019 à 15:05, Alex Herbert  > <mailto:alex.d.herb...@gmail.com>> a écrit :
> >>
> >> I have created RNG-97 and RNG-98 for Jump and LongJump.
> >>
> >> Please take a look and comment.
> >>
> >> The documentation highlights the implementation detail that a jump or long 
> >> jump creates a copy that is far ahead. The original generator is not 
> >> effected.
> >>
> >> The use case is thus:
> >>
> >> rng1 = …;
> >> rng2 = rng1.jump();
> >> rng3 = rng2.jump();
> >> rng4 = rng3.jump();
> >>
> >> As opposed to:
> >>
> >> rng1 = …;
> >> rng2 = rng1.jump();
> >> rng3 = rng1.jump();
> >> rng4 = rng1.jump();
> >>
> >> Where rng1 will be advanced each time leaving behind a copy generator.
> >>
> >> In either case it will be an overlap problem if any of the children are 
> >> then used for jumping. So as long as the documentation is clear then this 
> >> is OK. The helper method to create a jump series (or long jump series) in 
> >> RandomSource seems the best way to avoid incorrect usage.
> >
> > +1
> >
> > I think that the default should be to prevent a "jump" on the returned
> > instances.
> > An overload could be defined with a parameter (e.g. "allowFurtherJump") but 
> > I'd
> > leave it out until it is requested based on an actual use-case.
>
> I presume you are talking about the helper method in RandomSource.
>
> However it does open the possibility instead of this:
>
> JumpableUniformRandomProvider {
> UniformRandomProvider jump();
> }
>
> This only works if the state is modified for the current instance to allow 
> chaining jumps.
>
> Having typed all this up into a summary for the two tickets I feel that they 
> implement the idea in the wrong way. I think the jump should advance the 
> state of the current generator. This is the master generator created and used 
> in the high level code that controls the number of jumps that are required. 
> The returned copy should be a copy of where the generator was. The copy 
> should not be used for further jumps. In this way the interface for jump 
> could be made to return a UniformRandomProvider.
>
> When done like that the jumpable RNG is the only thing you need to hold a 
> reference to. And you can later decide (perhaps dynamically) if you need to 
> do some more jumps to get another series. Each call to jump moves the master 
> along and leaves behind a RNG that can be used for a set number of cycles 
> (the jump length). So you can do:
>
> JumpableUniformRandomProvider rng = …;
>
> UniformRandomProvider[] series1 = RandomSource.createJumpSeries(rng);
> // Do work with series1 and then maybe
> UniformRandomProvider[] series2 = RandomSource.createJumpSeries(rng);
> // Do work with series2, etc
> UniformRandomProvider[] series3 = RandomSource.createJumpSeries(rng);
>
> Or
>
> JumpableUniformRandomProvider masterRng = …;
>
> ExecutorService executor = Executors.newCachedThreadPool();
> ArrayList> futures = new ArrayList<>();
> for (Input input : inputs) {
> final UniformRandomProvider rng = masterRng.jump();
> futures.add(executor.submit(new Callable() {
> // Do something random with rng, then
> return new Result(...);
> }));
> }
>
> The later example uses ‘inputs’ as something where perhaps the size is not 
> known such as an Iterable or likewise in Java 8 it could be written to 
> consume a Stream.

That's a convincing example!

> Similarly the LongJumpableUniformRandomProvider interface can return a 
> JumpableUniformRandomProvider so preventing the result from being used for 
> another long jump but it can be used for (short) jumps.
>
> Have a think on use cases but my feeling is that the interface is more 
> powerful if you do advance the state and leave copies behind, rather than 
> creating future copies which must be chained together to create a series.

OK to change the perspective. ;-)

Gilles

>
> Alex
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-27 Thread Gilles Sadowski
Hello.

>
>
> > On 27 Apr 2019, at 14:49, Gilles Sadowski  wrote:
> >
> > Hi.
> >
> > Le sam. 27 avr. 2019 à 15:05, Alex Herbert  > > a écrit :
> >>
> >> I have created RNG-97 and RNG-98 for Jump and LongJump.
> >>
> >> Please take a look and comment.
> >>
> >> The documentation highlights the implementation detail that a jump or long 
> >> jump creates a copy that is far ahead. The original generator is not 
> >> effected.
> >>
> >> The use case is thus:
> >>
> >> rng1 = …;
> >> rng2 = rng1.jump();
> >> rng3 = rng2.jump();
> >> rng4 = rng3.jump();
> >>
> >> As opposed to:
> >>
> >> rng1 = …;
> >> rng2 = rng1.jump();
> >> rng3 = rng1.jump();
> >> rng4 = rng1.jump();
> >>
> >> Where rng1 will be advanced each time leaving behind a copy generator.
> >>
> >> In either case it will be an overlap problem if any of the children are 
> >> then used for jumping. So as long as the documentation is clear then this 
> >> is OK. The helper method to create a jump series (or long jump series) in 
> >> RandomSource seems the best way to avoid incorrect usage.
> >
> > +1
> >
> > I think that the default should be to prevent a "jump" on the returned
> > instances.
> > An overload could be defined with a parameter (e.g. "allowFurtherJump") but 
> > I'd
> > leave it out until it is requested based on an actual use-case.
>
> I presume you are talking about the helper method in RandomSource.
>
> However it does open the possibility instead of this:
>
> JumpableUniformRandomProvider {
> UniformRandomProvider jump();
> }
>
> This only works if the state is modified for the current instance to allow 
> chaining jumps.
>
> Having typed all this up into a summary for the two tickets I feel that they 
> implement the idea in the wrong way. I think the jump should advance the 
> state of the current generator. This is the master generator created and used 
> in the high level code that controls the number of jumps that are required. 
> The returned copy should be a copy of where the generator was. The copy 
> should not be used for further jumps. In this way the interface for jump 
> could be made to return a UniformRandomProvider.
>
> When done like that the jumpable RNG is the only thing you need to hold a 
> reference to. And you can later decide (perhaps dynamically) if you need to 
> do some more jumps to get another series. Each call to jump moves the master 
> along and leaves behind a RNG that can be used for a set number of cycles 
> (the jump length). So you can do:
>
> JumpableUniformRandomProvider rng = …;
>
> UniformRandomProvider[] series1 = RandomSource.createJumpSeries(rng);
> // Do work with series1 and then maybe
> UniformRandomProvider[] series2 = RandomSource.createJumpSeries(rng);
> // Do work with series2, etc
> UniformRandomProvider[] series3 = RandomSource.createJumpSeries(rng);
>
> Or
>
> JumpableUniformRandomProvider masterRng = …;
>
> ExecutorService executor = Executors.newCachedThreadPool();
> ArrayList> futures = new ArrayList<>();
> for (Input input : inputs) {
> final UniformRandomProvider rng = masterRng.jump();
> futures.add(executor.submit(new Callable() {
> // Do something random with rng, then
> return new Result(...);
> }));
> }
>
> The later example uses ‘inputs’ as something where perhaps the size is not 
> known such as an Iterable or likewise in Java 8 it could be written to 
> consume a Stream.

That's a convincing example!

> Similarly the LongJumpableUniformRandomProvider interface can return a 
> JumpableUniformRandomProvider so preventing the result from being used for 
> another long jump but it can be used for (short) jumps.
>
> Have a think on use cases but my feeling is that the interface is more 
> powerful if you do advance the state and leave copies behind, rather than 
> creating future copies which must be chained together to create a series.

OK to change the perspective. ;-)

Gilles

>
> Alex
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-27 Thread Alex Herbert


> On 27 Apr 2019, at 14:49, Gilles Sadowski  wrote:
> 
> Hi.
> 
> Le sam. 27 avr. 2019 à 15:05, Alex Herbert  > a écrit :
>> 
>> I have created RNG-97 and RNG-98 for Jump and LongJump.
>> 
>> Please take a look and comment.
>> 
>> The documentation highlights the implementation detail that a jump or long 
>> jump creates a copy that is far ahead. The original generator is not 
>> effected.
>> 
>> The use case is thus:
>> 
>> rng1 = …;
>> rng2 = rng1.jump();
>> rng3 = rng2.jump();
>> rng4 = rng3.jump();
>> 
>> As opposed to:
>> 
>> rng1 = …;
>> rng2 = rng1.jump();
>> rng3 = rng1.jump();
>> rng4 = rng1.jump();
>> 
>> Where rng1 will be advanced each time leaving behind a copy generator.
>> 
>> In either case it will be an overlap problem if any of the children are then 
>> used for jumping. So as long as the documentation is clear then this is OK. 
>> The helper method to create a jump series (or long jump series) in 
>> RandomSource seems the best way to avoid incorrect usage.
> 
> +1
> 
> I think that the default should be to prevent a "jump" on the returned
> instances.
> An overload could be defined with a parameter (e.g. "allowFurtherJump") but 
> I'd
> leave it out until it is requested based on an actual use-case.

I presume you are talking about the helper method in RandomSource.

However it does open the possibility instead of this:

JumpableUniformRandomProvider {
UniformRandomProvider jump();
}

This only works if the state is modified for the current instance to allow 
chaining jumps.

Having typed all this up into a summary for the two tickets I feel that they 
implement the idea in the wrong way. I think the jump should advance the state 
of the current generator. This is the master generator created and used in the 
high level code that controls the number of jumps that are required. The 
returned copy should be a copy of where the generator was. The copy should not 
be used for further jumps. In this way the interface for jump could be made to 
return a UniformRandomProvider.

When done like that the jumpable RNG is the only thing you need to hold a 
reference to. And you can later decide (perhaps dynamically) if you need to do 
some more jumps to get another series. Each call to jump moves the master along 
and leaves behind a RNG that can be used for a set number of cycles (the jump 
length). So you can do:

JumpableUniformRandomProvider rng = …;

UniformRandomProvider[] series1 = RandomSource.createJumpSeries(rng);
// Do work with series1 and then maybe
UniformRandomProvider[] series2 = RandomSource.createJumpSeries(rng);
// Do work with series2, etc
UniformRandomProvider[] series3 = RandomSource.createJumpSeries(rng);

Or

JumpableUniformRandomProvider masterRng = …;

ExecutorService executor = Executors.newCachedThreadPool();
ArrayList> futures = new ArrayList<>();
for (Input input : inputs) {
final UniformRandomProvider rng = masterRng.jump();
futures.add(executor.submit(new Callable() {
// Do something random with rng, then
return new Result(...);
}));
}

The later example uses ‘inputs’ as something where perhaps the size is not 
known such as an Iterable or likewise in Java 8 it could be written to consume 
a Stream.

Similarly the LongJumpableUniformRandomProvider interface can return a 
JumpableUniformRandomProvider so preventing the result from being used for 
another long jump but it can be used for (short) jumps.

Have a think on use cases but my feeling is that the interface is more powerful 
if you do advance the state and leave copies behind, rather than creating 
future copies which must be chained together to create a series.

Alex



> 
> Best,
> Gilles
> 
>> 
>> Alex
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org 
> 
> For additional commands, e-mail: dev-h...@commons.apache.org 
> 


Re: [rng] Split and Jump functions

2019-04-27 Thread Gilles Sadowski
Hi.

Le sam. 27 avr. 2019 à 15:05, Alex Herbert  a écrit :
>
> I have created RNG-97 and RNG-98 for Jump and LongJump.
>
> Please take a look and comment.
>
> The documentation highlights the implementation detail that a jump or long 
> jump creates a copy that is far ahead. The original generator is not effected.
>
> The use case is thus:
>
> rng1 = …;
> rng2 = rng1.jump();
> rng3 = rng2.jump();
> rng4 = rng3.jump();
>
> As opposed to:
>
> rng1 = …;
> rng2 = rng1.jump();
> rng3 = rng1.jump();
> rng4 = rng1.jump();
>
> Where rng1 will be advanced each time leaving behind a copy generator.
>
> In either case it will be an overlap problem if any of the children are then 
> used for jumping. So as long as the documentation is clear then this is OK. 
> The helper method to create a jump series (or long jump series) in 
> RandomSource seems the best way to avoid incorrect usage.

+1

I think that the default should be to prevent a "jump" on the returned
instances.
An overload could be defined with a parameter (e.g. "allowFurtherJump") but I'd
leave it out until it is requested based on an actual use-case.

Best,
Gilles

>
> Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-27 Thread Alex Herbert
I have created RNG-97 and RNG-98 for Jump and LongJump.

Please take a look and comment.

The documentation highlights the implementation detail that a jump or long jump 
creates a copy that is far ahead. The original generator is not effected.

The use case is thus:

rng1 = …;
rng2 = rng1.jump();
rng3 = rng2.jump();
rng4 = rng3.jump();

As opposed to:

rng1 = …;
rng2 = rng1.jump();
rng3 = rng1.jump();
rng4 = rng1.jump();

Where rng1 will be advanced each time leaving behind a copy generator.

In either case it will be an overlap problem if any of the children are then 
used for jumping. So as long as the documentation is clear then this is OK. The 
helper method to create a jump series (or long jump series) in RandomSource 
seems the best way to avoid incorrect usage.

Alex




-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-19 Thread Alex Herbert


> On 19 Apr 2019, at 15:30, Gilles Sadowski  wrote:
> 
> Le jeu. 18 avr. 2019 à 21:53, Alex Herbert  > a écrit :
>> 
>> 
>> 
>>> On 18 Apr 2019, at 14:12, Gilles Sadowski  wrote:
>>> 
>>> Hello Alex.
>>> 
>> [...]
 
 OK so this results in:
 
 /**
 * Some summary.
 */
 public interface JumpableUniformRandomProvider extends
 UniformRandomProvider {
/**
 * Creates a copy of the UniformRandomProvider and advances the
 state of the copy.
 * The state of the current instance is not altered. The state of
 the copy will be
 * advanced an equivalent of {@code n} sequential calls to a method
 that updates the
 * state of the provider.
 *
 * @return the copy with an advanced state
 */
JumpableUniformRandomProvider jump();
 }
 
>>> 
>>> +1
>>> [Clean and lean: and no side-effects to explain...]
>>> 
 
 This can be documented in an implementation as:
 
 public class MyJumpingRNG implements JumpableUniformRandomProvider {
/**
 * {@inheritDoc}
 *
 * The jump size {@code n} is the equivalent of {@code 2^64}
 calls to
 * {@link UniformRandomProvider#nextLong() nextLong()}.
 */
@Override
public JumpableUniformRandomProvider jump() {
// TODO Auto-generated method stub
return null;
}
 }
>>> 
>>> +1
>>> 
 
 Do we add a second interface for LongJumpableUniformRandomProvider?
>>> 
>>> Sure, if the functionality is provided by some of the algorithms implemented
>>> in [RNG].
>>> But let's have the two functionalities in separate commits.
>>> 
 
>> So the options are (in all cases returning the copy):
>> 
>> 1. createAndJumpCopy
>> 2. copyAndJumpParent
>> 3. jumpParentAndCopy
>> 4. jump and copy separately
>> 
>> 1. Your preferred option. A copy of the state is made. The state is 
>> advanced in the copy and returned. But when called repeatedly it will 
>> get the same generator and code must be organised appropriately.
> We could provide a convenience method in  "RandomSource":
> 
> public UniformRandomProvider[] jump(int n,
> JumpableUniformRandomProvider parent) {
>final UniformRandomProvider[] rngs = new UniformRandomProvider[n];
>UniformRandomProvider tmp = parent;
>for (int i = 0; i < n; i++) {
>rngs[i] = restrict(tmp);
>tmp = tmp.jump();
>}
>return rngs;
> }
 
 +1. Remove the need for the user to repeat boiler plate code.
 
 Same sort of idea of longJump() too.
>>> 
>>> +1
>>> 
>> It is not actually possible to jump forward a single instance. Only 
>> children are advanced.
> A feature: There is only one way to alter the state of an instance
> (i.e. a call to "next()").
 OK.
>>> 
>>> Great. :-)
>>> 
>>> Gilles
>> 
>> This sounds like a resolution. I will put the ideas into a Jira ticket for 
>> Jumpable.
> 
> Thanks.
> 
>> 
>> I am a bit busy at the moment with other mini-projects (updates to 
>> nextInt(int) being the main one, Poisson samplers (again) being another 
>> leading to a family of log normal based distributions that may be supported 
>> using cumulative probability look-up tables) but will hope to get this done 
>> soon. The actual implementation should be quite easy.
>> 
>> Here’s one for you to think about on the subject of Jumpable. What about 
>> support for the generators that can be advanced by a user specified 
>> increment? For example the SplitMix algorithm is based on a sequence and so 
>> can be advanced from 1 to 2^64-1 steps. It does seem strange to support this 
>> (if we add jumpable to SplitMix) using only one specific jump distance. A 
>> Skippable can do this:
>> 
>> SkippableURP {
>>public SkippableURP skip(long steps);
>> 
>>// or
>> 
>>public SkippableURP skipPower2(long power);
>> }
>> 
>> Too much?
> 
> You read my mind. ;-)
> What would be the uses of "short" jumps (i.e. having small
> non-overlapping sequences

I could not think of one. Just wanted to raise the suggestion. It would not be 
supported for many generators anyway (currently only SplitMix, maybe more in 
the future).

> from many instances, rather than a longer one from a single instance)?
> IIUC, the hard-coded jump sizes in existing implementations seem a compromise,
> based on the number of potentially concurrent threads, or independent
> simulations.
> Increasing that number does not seem necessary for the mid-term.
> 
> Regards,
> Gilles
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org 
> 
> For additional commands, e-mail: dev-h...@commons.apache.org 
> 


Re: [rng] Split and Jump functions

2019-04-19 Thread Gilles Sadowski
Le jeu. 18 avr. 2019 à 21:53, Alex Herbert  a écrit :
>
>
>
> > On 18 Apr 2019, at 14:12, Gilles Sadowski  wrote:
> >
> > Hello Alex.
> >
>  [...]
> >>
> >> OK so this results in:
> >>
> >> /**
> >>  * Some summary.
> >>  */
> >> public interface JumpableUniformRandomProvider extends
> >> UniformRandomProvider {
> >> /**
> >>  * Creates a copy of the UniformRandomProvider and advances the
> >> state of the copy.
> >>  * The state of the current instance is not altered. The state of
> >> the copy will be
> >>  * advanced an equivalent of {@code n} sequential calls to a method
> >> that updates the
> >>  * state of the provider.
> >>  *
> >>  * @return the copy with an advanced state
> >>  */
> >> JumpableUniformRandomProvider jump();
> >> }
> >>
> >
> > +1
> > [Clean and lean: and no side-effects to explain...]
> >
> >>
> >> This can be documented in an implementation as:
> >>
> >> public class MyJumpingRNG implements JumpableUniformRandomProvider {
> >> /**
> >>  * {@inheritDoc}
> >>  *
> >>  * The jump size {@code n} is the equivalent of {@code 2^64}
> >> calls to
> >>  * {@link UniformRandomProvider#nextLong() nextLong()}.
> >>  */
> >> @Override
> >> public JumpableUniformRandomProvider jump() {
> >> // TODO Auto-generated method stub
> >> return null;
> >> }
> >> }
> >
> > +1
> >
> >>
> >> Do we add a second interface for LongJumpableUniformRandomProvider?
> >
> > Sure, if the functionality is provided by some of the algorithms implemented
> > in [RNG].
> > But let's have the two functionalities in separate commits.
> >
> >>
>  So the options are (in all cases returning the copy):
> 
>  1. createAndJumpCopy
>  2. copyAndJumpParent
>  3. jumpParentAndCopy
>  4. jump and copy separately
> 
>  1. Your preferred option. A copy of the state is made. The state is 
>  advanced in the copy and returned. But when called repeatedly it will 
>  get the same generator and code must be organised appropriately.
> >>> We could provide a convenience method in  "RandomSource":
> >>>
> >>> public UniformRandomProvider[] jump(int n,
> >>> JumpableUniformRandomProvider parent) {
> >>> final UniformRandomProvider[] rngs = new UniformRandomProvider[n];
> >>> UniformRandomProvider tmp = parent;
> >>> for (int i = 0; i < n; i++) {
> >>> rngs[i] = restrict(tmp);
> >>> tmp = tmp.jump();
> >>> }
> >>> return rngs;
> >>> }
> >>
> >> +1. Remove the need for the user to repeat boiler plate code.
> >>
> >> Same sort of idea of longJump() too.
> >
> > +1
> >
>  It is not actually possible to jump forward a single instance. Only 
>  children are advanced.
> >>> A feature: There is only one way to alter the state of an instance
> >>> (i.e. a call to "next()").
> >> OK.
> >
> > Great. :-)
> >
> > Gilles
>
> This sounds like a resolution. I will put the ideas into a Jira ticket for 
> Jumpable.

Thanks.

>
> I am a bit busy at the moment with other mini-projects (updates to 
> nextInt(int) being the main one, Poisson samplers (again) being another 
> leading to a family of log normal based distributions that may be supported 
> using cumulative probability look-up tables) but will hope to get this done 
> soon. The actual implementation should be quite easy.
>
> Here’s one for you to think about on the subject of Jumpable. What about 
> support for the generators that can be advanced by a user specified 
> increment? For example the SplitMix algorithm is based on a sequence and so 
> can be advanced from 1 to 2^64-1 steps. It does seem strange to support this 
> (if we add jumpable to SplitMix) using only one specific jump distance. A 
> Skippable can do this:
>
> SkippableURP {
> public SkippableURP skip(long steps);
>
> // or
>
> public SkippableURP skipPower2(long power);
> }
>
> Too much?

You read my mind. ;-)
What would be the uses of "short" jumps (i.e. having small
non-overlapping sequences
from many instances, rather than a longer one from a single instance)?
IIUC, the hard-coded jump sizes in existing implementations seem a compromise,
based on the number of potentially concurrent threads, or independent
simulations.
Increasing that number does not seem necessary for the mid-term.

Regards,
Gilles

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-18 Thread Alex Herbert



> On 18 Apr 2019, at 14:12, Gilles Sadowski  wrote:
> 
> Hello Alex.
> 
 [...]
>> 
>> OK so this results in:
>> 
>> /**
>>  * Some summary.
>>  */
>> public interface JumpableUniformRandomProvider extends
>> UniformRandomProvider {
>> /**
>>  * Creates a copy of the UniformRandomProvider and advances the
>> state of the copy.
>>  * The state of the current instance is not altered. The state of
>> the copy will be
>>  * advanced an equivalent of {@code n} sequential calls to a method
>> that updates the
>>  * state of the provider.
>>  *
>>  * @return the copy with an advanced state
>>  */
>> JumpableUniformRandomProvider jump();
>> }
>> 
> 
> +1
> [Clean and lean: and no side-effects to explain...]
> 
>> 
>> This can be documented in an implementation as:
>> 
>> public class MyJumpingRNG implements JumpableUniformRandomProvider {
>> /**
>>  * {@inheritDoc}
>>  *
>>  * The jump size {@code n} is the equivalent of {@code 2^64}
>> calls to
>>  * {@link UniformRandomProvider#nextLong() nextLong()}.
>>  */
>> @Override
>> public JumpableUniformRandomProvider jump() {
>> // TODO Auto-generated method stub
>> return null;
>> }
>> }
> 
> +1
> 
>> 
>> Do we add a second interface for LongJumpableUniformRandomProvider?
> 
> Sure, if the functionality is provided by some of the algorithms implemented
> in [RNG].
> But let's have the two functionalities in separate commits.
> 
>> 
 So the options are (in all cases returning the copy):
 
 1. createAndJumpCopy
 2. copyAndJumpParent
 3. jumpParentAndCopy
 4. jump and copy separately
 
 1. Your preferred option. A copy of the state is made. The state is 
 advanced in the copy and returned. But when called repeatedly it will get 
 the same generator and code must be organised appropriately.
>>> We could provide a convenience method in  "RandomSource":
>>> 
>>> public UniformRandomProvider[] jump(int n,
>>> JumpableUniformRandomProvider parent) {
>>> final UniformRandomProvider[] rngs = new UniformRandomProvider[n];
>>> UniformRandomProvider tmp = parent;
>>> for (int i = 0; i < n; i++) {
>>> rngs[i] = restrict(tmp);
>>> tmp = tmp.jump();
>>> }
>>> return rngs;
>>> }
>> 
>> +1. Remove the need for the user to repeat boiler plate code.
>> 
>> Same sort of idea of longJump() too.
> 
> +1
> 
 It is not actually possible to jump forward a single instance. Only 
 children are advanced.
>>> A feature: There is only one way to alter the state of an instance
>>> (i.e. a call to "next()").
>> OK.
> 
> Great. :-)
> 
> Gilles

This sounds like a resolution. I will put the ideas into a Jira ticket for 
Jumpable.

I am a bit busy at the moment with other mini-projects (updates to nextInt(int) 
being the main one, Poisson samplers (again) being another leading to a family 
of log normal based distributions that may be supported using cumulative 
probability look-up tables) but will hope to get this done soon. The actual 
implementation should be quite easy.

Here’s one for you to think about on the subject of Jumpable. What about 
support for the generators that can be advanced by a user specified increment? 
For example the SplitMix algorithm is based on a sequence and so can be 
advanced from 1 to 2^64-1 steps. It does seem strange to support this (if we 
add jumpable to SplitMix) using only one specific jump distance. A Skippable 
can do this:

SkippableURP {
public SkippableURP skip(long steps);

// or

public SkippableURP skipPower2(long power);
}

Too much?

> 
>>> 
 [...]
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-18 Thread Gilles Sadowski
Hello Alex.

>>> [...]
>
> OK so this results in:
>
> /**
>   * Some summary.
>   */
> public interface JumpableUniformRandomProvider extends
> UniformRandomProvider {
>  /**
>   * Creates a copy of the UniformRandomProvider and advances the
> state of the copy.
>   * The state of the current instance is not altered. The state of
> the copy will be
>   * advanced an equivalent of {@code n} sequential calls to a method
> that updates the
>   * state of the provider.
>   *
>   * @return the copy with an advanced state
>   */
>  JumpableUniformRandomProvider jump();
> }
>

+1
[Clean and lean: and no side-effects to explain...]

>
> This can be documented in an implementation as:
>
> public class MyJumpingRNG implements JumpableUniformRandomProvider {
>  /**
>   * {@inheritDoc}
>   *
>   * The jump size {@code n} is the equivalent of {@code 2^64}
> calls to
>   * {@link UniformRandomProvider#nextLong() nextLong()}.
>   */
>  @Override
>  public JumpableUniformRandomProvider jump() {
>  // TODO Auto-generated method stub
>  return null;
>  }
> }

+1

>
> Do we add a second interface for LongJumpableUniformRandomProvider?

Sure, if the functionality is provided by some of the algorithms implemented
in [RNG].
But let's have the two functionalities in separate commits.

>
> >> So the options are (in all cases returning the copy):
> >>
> >> 1. createAndJumpCopy
> >> 2. copyAndJumpParent
> >> 3. jumpParentAndCopy
> >> 4. jump and copy separately
> >>
> >> 1. Your preferred option. A copy of the state is made. The state is 
> >> advanced in the copy and returned. But when called repeatedly it will get 
> >> the same generator and code must be organised appropriately.
> > We could provide a convenience method in  "RandomSource":
> >
> > public UniformRandomProvider[] jump(int n,
> > JumpableUniformRandomProvider parent) {
> >  final UniformRandomProvider[] rngs = new UniformRandomProvider[n];
> >  UniformRandomProvider tmp = parent;
> >  for (int i = 0; i < n; i++) {
> >  rngs[i] = restrict(tmp);
> >  tmp = tmp.jump();
> >  }
> >  return rngs;
> > }
>
> +1. Remove the need for the user to repeat boiler plate code.
>
> Same sort of idea of longJump() too.

+1

> >> It is not actually possible to jump forward a single instance. Only 
> >> children are advanced.
> > A feature: There is only one way to alter the state of an instance
> > (i.e. a call to "next()").
> OK.

Great. :-)

Gilles

> >
>>> [...]

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-17 Thread Alex Herbert



On 17/04/2019 16:43, Gilles Sadowski wrote:

Hello.

Le lun. 15 avr. 2019 à 01:03, Alex Herbert  a écrit :




On 14 Apr 2019, at 01:31, Gilles Sadowski  wrote:

Hello.


On 11/04/2019 13:22, Gilles Sadowski wrote:

[...]

Not adding a dedicated method would mean everyone has to do this:

JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
RandomSource.create(…)

But adding a mirror methods:

JumpableUniformRandomProvider RandomSource::createJumpable(…)
LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)

Does seem to add clutter to RandomSource. If we leave it out for now it can be 
added in the future.

+1 (for leaving out).


Is there scope for needing the Jumpable to be detected through the API? E.g. 
add:

boolean RandomSource::isJumpable(RandomSource)

We would just need to maintain an EnumSet for Jumpable and likewise for 
LongJumpable. Again more clutter to the RandomSource interface but perhaps less 
so than a mirror create method that would throw IllegalArgumentException for 
any RandomSource specified that is not Jumpable.

+1 (for leaving out for now).


[...]

I was hoping to avoid creating a duplicate class. But actually that
would be fine and easier for testing. The implementation is trivial anyway.

Why duplicate?
IIUC, shouldn't the existing "core" class define an additional constructor
that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
would use the default increment.
[Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]

OK. That was where I should have gone to start with. I’ll do that.


I've just finished an initial implementation of the MSWS RNG that uses a
self-generated Weyl sequence. It works. So the idea for this can be
applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
both. Perhaps moving the code to create the Weyl sequence increment to
NumberFactory.

+1

OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when I’ll 
get around to doing this though. It seems less important than other tasks. It 
may even be redundant if the only reason is to increase the state space for the 
generator. Each generator would still only have a period of 2^64. You can just 
create a lot of them with a weak assurance they will not overlap and that the 
uniformity is statistically the same. Since there are of the order of 2^63 
variants we are not going to be able to test this and would have to rely on the 
theory behind it.

If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can create 
either 2^32 rngs with no overlap for the first 2^32 output numbers or 2^16 rngs 
that can each be jumped 2^16 times with no overlap for the first 2^32 output 
numbers. That is a lot for a small parallel situation and does have the 
assurance of no overlap. Any parallel usage where longer sequences are expected 
to be used can use one of the XorShiRo generators.

I'm a bit lost here.  I thought that "SplitMix64" did not provide
"jump",

The state it is a linear sum. So can be jumped very easily.

y = mx + c

c is the current Weyl state, m the number of transitions and x the Weyl
increment. So we can make SplitMix Jumpable. The maximum jump length is
2^64 which will wrap the state.


hence the
"Weyl way" to ensure no-overlap, with high probability (why "weak assurance"?)

I used the term "weak assurance" as opposed to "concrete assurance". The
wording used by the JDK is "with very high probability, the set of
values collectively generated by the two objects has the same
statistical properties as if the same quantity of values were generated
by a single object". It is a bit of semantics.


Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
cannot ensure that the added flexibility does not come with unsuspected
drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments for
choosing the (hard-coded) values that do produce good sequences.]

The likelihood of overlap is 1/[the number of different Weyl sequences].
This is around 2^-63. So it is small. However the base implementation
for SplitMix uses a specific Weyl sequence increment based on the golden
ratio. I do not know the original of this. For example is it better that
any other irrational number converted to an integer, e.g. pi, Euler's
number (e), sqrt(2)? The irrational number comes from the origin of the
Weyl sequence which is based on irrationals. But there may be no reason
that the golden ratio was chosen rather than any other source irrational.

Does the non-overlap weak assurance take into account the fact that
there are no irrational numbers in a truncated representation (like the
one stored in a computer)?

I would say that the 'weak assurance'/'very unlikely' statements allude to the 
fact that no-one has actually tested all the 2^63 different Weyl sequences 
after they pass through the hash mixing function to see if they output the same 
sequence as any other, or are weaker as random generators. It would be quite 

Re: [rng] Split and Jump functions

2019-04-17 Thread Gilles Sadowski
Hello.

Le lun. 15 avr. 2019 à 01:03, Alex Herbert  a écrit :
>
>
>
> > On 14 Apr 2019, at 01:31, Gilles Sadowski  wrote:
> >
> > Hello.
> >
> >> On 11/04/2019 13:22, Gilles Sadowski wrote:
> > [...]
>  Not adding a dedicated method would mean everyone has to do this:
> 
>  JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
>  RandomSource.create(…)
> 
>  But adding a mirror methods:
> 
>  JumpableUniformRandomProvider RandomSource::createJumpable(…)
>  LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)
> 
>  Does seem to add clutter to RandomSource. If we leave it out for now it 
>  can be added in the future.
> >>> +1 (for leaving out).
> >>>
>  Is there scope for needing the Jumpable to be detected through the API? 
>  E.g. add:
> 
>  boolean RandomSource::isJumpable(RandomSource)
> 
>  We would just need to maintain an EnumSet for Jumpable and likewise for 
>  LongJumpable. Again more clutter to the RandomSource interface but 
>  perhaps less so than a mirror create method that would throw 
>  IllegalArgumentException for any RandomSource specified that is not 
>  Jumpable.
> >>> +1 (for leaving out for now).
> >>>
> > [...]
> >> I was hoping to avoid creating a duplicate class. But actually that
> >> would be fine and easier for testing. The implementation is trivial 
> >> anyway.
> > Why duplicate?
> > IIUC, shouldn't the existing "core" class define an additional 
> > constructor
> > that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
> > would use the default increment.
> > [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]
>  OK. That was where I should have gone to start with. I’ll do that.
> 
> >> I've just finished an initial implementation of the MSWS RNG that uses 
> >> a
> >> self-generated Weyl sequence. It works. So the idea for this can be
> >> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
> >> both. Perhaps moving the code to create the Weyl sequence increment to
> >> NumberFactory.
> > +1
>  OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when 
>  I’ll get around to doing this though. It seems less important than other 
>  tasks. It may even be redundant if the only reason is to increase the 
>  state space for the generator. Each generator would still only have a 
>  period of 2^64. You can just create a lot of them with a weak assurance 
>  they will not overlap and that the uniformity is statistically the same. 
>  Since there are of the order of 2^63 variants we are not going to be 
>  able to test this and would have to rely on the theory behind it.
> 
>  If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can 
>  create either 2^32 rngs with no overlap for the first 2^32 output 
>  numbers or 2^16 rngs that can each be jumped 2^16 times with no overlap 
>  for the first 2^32 output numbers. That is a lot for a small parallel 
>  situation and does have the assurance of no overlap. Any parallel usage 
>  where longer sequences are expected to be used can use one of the 
>  XorShiRo generators.
> >>> I'm a bit lost here.  I thought that "SplitMix64" did not provide
> >>> "jump",
> >>
> >> The state it is a linear sum. So can be jumped very easily.
> >>
> >> y = mx + c
> >>
> >> c is the current Weyl state, m the number of transitions and x the Weyl
> >> increment. So we can make SplitMix Jumpable. The maximum jump length is
> >> 2^64 which will wrap the state.
> >>
> >>> hence the
> >>> "Weyl way" to ensure no-overlap, with high probability (why "weak 
> >>> assurance"?)
> >> I used the term "weak assurance" as opposed to "concrete assurance". The
> >> wording used by the JDK is "with very high probability, the set of
> >> values collectively generated by the two objects has the same
> >> statistical properties as if the same quantity of values were generated
> >> by a single object". It is a bit of semantics.
> >>
> >>>
> >>> Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
> >>> cannot ensure that the added flexibility does not come with unsuspected
> >>> drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments 
> >>> for
> >>> choosing the (hard-coded) values that do produce good sequences.]
> >>
> >> The likelihood of overlap is 1/[the number of different Weyl sequences].
> >> This is around 2^-63. So it is small. However the base implementation
> >> for SplitMix uses a specific Weyl sequence increment based on the golden
> >> ratio. I do not know the original of this. For example is it better that
> >> any other irrational number converted to an integer, e.g. pi, Euler's
> >> number (e), sqrt(2)? The irrational number comes from the origin of the
> >> Weyl sequence which is based on irrationals.

Re: [rng] Split and Jump functions

2019-04-14 Thread Alex Herbert


> On 14 Apr 2019, at 01:31, Gilles Sadowski  wrote:
> 
> Hello.
> 
>> On 11/04/2019 13:22, Gilles Sadowski wrote:
> [...]
 Not adding a dedicated method would mean everyone has to do this:
 
 JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
 RandomSource.create(…)
 
 But adding a mirror methods:
 
 JumpableUniformRandomProvider RandomSource::createJumpable(…)
 LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)
 
 Does seem to add clutter to RandomSource. If we leave it out for now it 
 can be added in the future.
>>> +1 (for leaving out).
>>> 
 Is there scope for needing the Jumpable to be detected through the API? 
 E.g. add:
 
 boolean RandomSource::isJumpable(RandomSource)
 
 We would just need to maintain an EnumSet for Jumpable and likewise for 
 LongJumpable. Again more clutter to the RandomSource interface but perhaps 
 less so than a mirror create method that would throw 
 IllegalArgumentException for any RandomSource specified that is not 
 Jumpable.
>>> +1 (for leaving out for now).
>>> 
> [...]
>> I was hoping to avoid creating a duplicate class. But actually that
>> would be fine and easier for testing. The implementation is trivial 
>> anyway.
> Why duplicate?
> IIUC, shouldn't the existing "core" class define an additional constructor
> that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
> would use the default increment.
> [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]
 OK. That was where I should have gone to start with. I’ll do that.
 
>> I've just finished an initial implementation of the MSWS RNG that uses a
>> self-generated Weyl sequence. It works. So the idea for this can be
>> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
>> both. Perhaps moving the code to create the Weyl sequence increment to
>> NumberFactory.
> +1
 OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when 
 I’ll get around to doing this though. It seems less important than other 
 tasks. It may even be redundant if the only reason is to increase the 
 state space for the generator. Each generator would still only have a 
 period of 2^64. You can just create a lot of them with a weak assurance 
 they will not overlap and that the uniformity is statistically the same. 
 Since there are of the order of 2^63 variants we are not going to be able 
 to test this and would have to rely on the theory behind it.
 
 If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can 
 create either 2^32 rngs with no overlap for the first 2^32 output numbers 
 or 2^16 rngs that can each be jumped 2^16 times with no overlap for the 
 first 2^32 output numbers. That is a lot for a small parallel situation 
 and does have the assurance of no overlap. Any parallel usage where longer 
 sequences are expected to be used can use one of the XorShiRo generators.
>>> I'm a bit lost here.  I thought that "SplitMix64" did not provide
>>> "jump",
>> 
>> The state it is a linear sum. So can be jumped very easily.
>> 
>> y = mx + c
>> 
>> c is the current Weyl state, m the number of transitions and x the Weyl
>> increment. So we can make SplitMix Jumpable. The maximum jump length is
>> 2^64 which will wrap the state.
>> 
>>> hence the
>>> "Weyl way" to ensure no-overlap, with high probability (why "weak 
>>> assurance"?)
>> I used the term "weak assurance" as opposed to "concrete assurance". The
>> wording used by the JDK is "with very high probability, the set of
>> values collectively generated by the two objects has the same
>> statistical properties as if the same quantity of values were generated
>> by a single object". It is a bit of semantics.
>> 
>>> 
>>> Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
>>> cannot ensure that the added flexibility does not come with unsuspected
>>> drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments for
>>> choosing the (hard-coded) values that do produce good sequences.]
>> 
>> The likelihood of overlap is 1/[the number of different Weyl sequences].
>> This is around 2^-63. So it is small. However the base implementation
>> for SplitMix uses a specific Weyl sequence increment based on the golden
>> ratio. I do not know the original of this. For example is it better that
>> any other irrational number converted to an integer, e.g. pi, Euler's
>> number (e), sqrt(2)? The irrational number comes from the origin of the
>> Weyl sequence which is based on irrationals. But there may be no reason
>> that the golden ratio was chosen rather than any other source irrational.
> 
> Does the non-overlap weak assurance take into account the fact that
> there are no irrational numbers in a truncated representation (like the
> one stored in a compu

Re: [rng] Split and Jump functions

2019-04-13 Thread Gilles Sadowski
Hello.

> On 11/04/2019 13:22, Gilles Sadowski wrote:
> >>> [...]
> >> Not adding a dedicated method would mean everyone has to do this:
> >>
> >> JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
> >> RandomSource.create(…)
> >>
> >> But adding a mirror methods:
> >>
> >> JumpableUniformRandomProvider RandomSource::createJumpable(…)
> >> LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)
> >>
> >> Does seem to add clutter to RandomSource. If we leave it out for now it 
> >> can be added in the future.
> > +1 (for leaving out).
> >
> >> Is there scope for needing the Jumpable to be detected through the API? 
> >> E.g. add:
> >>
> >> boolean RandomSource::isJumpable(RandomSource)
> >>
> >> We would just need to maintain an EnumSet for Jumpable and likewise for 
> >> LongJumpable. Again more clutter to the RandomSource interface but perhaps 
> >> less so than a mirror create method that would throw 
> >> IllegalArgumentException for any RandomSource specified that is not 
> >> Jumpable.
> > +1 (for leaving out for now).
> >
> >>> [...]
>  I was hoping to avoid creating a duplicate class. But actually that
>  would be fine and easier for testing. The implementation is trivial 
>  anyway.
> >>> Why duplicate?
> >>> IIUC, shouldn't the existing "core" class define an additional constructor
> >>> that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
> >>> would use the default increment.
> >>> [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]
> >> OK. That was where I should have gone to start with. I’ll do that.
> >>
>  I've just finished an initial implementation of the MSWS RNG that uses a
>  self-generated Weyl sequence. It works. So the idea for this can be
>  applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
>  both. Perhaps moving the code to create the Weyl sequence increment to
>  NumberFactory.
> >>> +1
> >> OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when 
> >> I’ll get around to doing this though. It seems less important than other 
> >> tasks. It may even be redundant if the only reason is to increase the 
> >> state space for the generator. Each generator would still only have a 
> >> period of 2^64. You can just create a lot of them with a weak assurance 
> >> they will not overlap and that the uniformity is statistically the same. 
> >> Since there are of the order of 2^63 variants we are not going to be able 
> >> to test this and would have to rely on the theory behind it.
> >>
> >> If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can 
> >> create either 2^32 rngs with no overlap for the first 2^32 output numbers 
> >> or 2^16 rngs that can each be jumped 2^16 times with no overlap for the 
> >> first 2^32 output numbers. That is a lot for a small parallel situation 
> >> and does have the assurance of no overlap. Any parallel usage where longer 
> >> sequences are expected to be used can use one of the XorShiRo generators.
> > I'm a bit lost here.  I thought that "SplitMix64" did not provide
> > "jump",
>
> The state it is a linear sum. So can be jumped very easily.
>
> y = mx + c
>
> c is the current Weyl state, m the number of transitions and x the Weyl
> increment. So we can make SplitMix Jumpable. The maximum jump length is
> 2^64 which will wrap the state.
>
> > hence the
> > "Weyl way" to ensure no-overlap, with high probability (why "weak 
> > assurance"?)
> I used the term "weak assurance" as opposed to "concrete assurance". The
> wording used by the JDK is "with very high probability, the set of
> values collectively generated by the two objects has the same
> statistical properties as if the same quantity of values were generated
> by a single object". It is a bit of semantics.
>
> >
> > Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
> > cannot ensure that the added flexibility does not come with unsuspected
> > drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments for
> > choosing the (hard-coded) values that do produce good sequences.]
>
> The likelihood of overlap is 1/[the number of different Weyl sequences].
> This is around 2^-63. So it is small. However the base implementation
> for SplitMix uses a specific Weyl sequence increment based on the golden
> ratio. I do not know the original of this. For example is it better that
> any other irrational number converted to an integer, e.g. pi, Euler's
> number (e), sqrt(2)? The irrational number comes from the origin of the
> Weyl sequence which is based on irrationals. But there may be no reason
> that the golden ratio was chosen rather than any other source irrational.

Does the non-overlap weak assurance take into account the fact that
there are no irrational numbers in a truncated representation (like the
one stored in a computer)?
Isn't it why those "internal" numbers are carefully selected (like in
"TwoCmres"), because they generate 

Re: [rng] Split and Jump functions

2019-04-11 Thread Alex Herbert



On 11/04/2019 13:22, Gilles Sadowski wrote:

[...]

Not adding a dedicated method would mean everyone has to do this:

JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
RandomSource.create(…)

But adding a mirror methods:

JumpableUniformRandomProvider RandomSource::createJumpable(…)
LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)

Does seem to add clutter to RandomSource. If we leave it out for now it can be 
added in the future.

+1 (for leaving out).


Is there scope for needing the Jumpable to be detected through the API? E.g. 
add:

boolean RandomSource::isJumpable(RandomSource)

We would just need to maintain an EnumSet for Jumpable and likewise for 
LongJumpable. Again more clutter to the RandomSource interface but perhaps less 
so than a mirror create method that would throw IllegalArgumentException for 
any RandomSource specified that is not Jumpable.

+1 (for leaving out for now).


[...]

I was hoping to avoid creating a duplicate class. But actually that
would be fine and easier for testing. The implementation is trivial anyway.

Why duplicate?
IIUC, shouldn't the existing "core" class define an additional constructor
that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
would use the default increment.
[Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]

OK. That was where I should have gone to start with. I’ll do that.


I've just finished an initial implementation of the MSWS RNG that uses a
self-generated Weyl sequence. It works. So the idea for this can be
applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
both. Perhaps moving the code to create the Weyl sequence increment to
NumberFactory.

+1

OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when I’ll 
get around to doing this though. It seems less important than other tasks. It 
may even be redundant if the only reason is to increase the state space for the 
generator. Each generator would still only have a period of 2^64. You can just 
create a lot of them with a weak assurance they will not overlap and that the 
uniformity is statistically the same. Since there are of the order of 2^63 
variants we are not going to be able to test this and would have to rely on the 
theory behind it.

If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can create 
either 2^32 rngs with no overlap for the first 2^32 output numbers or 2^16 rngs 
that can each be jumped 2^16 times with no overlap for the first 2^32 output 
numbers. That is a lot for a small parallel situation and does have the 
assurance of no overlap. Any parallel usage where longer sequences are expected 
to be used can use one of the XorShiRo generators.

I'm a bit lost here.  I thought that "SplitMix64" did not provide
"jump",


The state it is a linear sum. So can be jumped very easily.

y = mx + c

c is the current Weyl state, m the number of transitions and x the Weyl 
increment. So we can make SplitMix Jumpable. The maximum jump length is 
2^64 which will wrap the state.



hence the
"Weyl way" to ensure no-overlap, with high probability (why "weak assurance"?)
I used the term "weak assurance" as opposed to "concrete assurance". The 
wording used by the JDK is "with very high probability, the set of 
values collectively generated by the two objects has the same 
statistical properties as if the same quantity of values were generated 
by a single object". It is a bit of semantics.




Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
cannot ensure that the added flexibility does not come with unsuspected
drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments for
choosing the (hard-coded) values that do produce good sequences.]


The likelihood of overlap is 1/[the number of different Weyl sequences]. 
This is around 2^-63. So it is small. However the base implementation 
for SplitMix uses a specific Weyl sequence increment based on the golden 
ratio. I do not know the original of this. For example is it better that 
any other irrational number converted to an integer, e.g. pi, Euler's 
number (e), sqrt(2)? The irrational number comes from the origin of the 
Weyl sequence which is based on irrationals. But there may be no reason 
that the golden ratio was chosen rather than any other source irrational.



On the subject of splitting verses jumping:

Jumping seems to be a case where the limit of the multi-threaded 
scenario is known beforehand, e.g. a simulation with 1024 parallel 
tasks, each task with a known limit on the count of random numbers 
required. So if your jump length is above the number of random numbers 
required for each task you can jump to get a single generator for each 
task and ensure no overlap.


Split is used when the multi-threaded scenario is not known. This is a 
case for SplittableRandom which is used in parallel algorithms which use 
forking. The point at which a fork will occur and the task size of 

Re: [rng] Split and Jump functions

2019-04-11 Thread Gilles Sadowski
> > [...]
>
> Not adding a dedicated method would mean everyone has to do this:
>
> JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
> RandomSource.create(…)
>
> But adding a mirror methods:
>
> JumpableUniformRandomProvider RandomSource::createJumpable(…)
> LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)
>
> Does seem to add clutter to RandomSource. If we leave it out for now it can 
> be added in the future.

+1 (for leaving out).

> Is there scope for needing the Jumpable to be detected through the API? E.g. 
> add:
>
> boolean RandomSource::isJumpable(RandomSource)
>
> We would just need to maintain an EnumSet for Jumpable and likewise for 
> LongJumpable. Again more clutter to the RandomSource interface but perhaps 
> less so than a mirror create method that would throw IllegalArgumentException 
> for any RandomSource specified that is not Jumpable.

+1 (for leaving out for now).

>
> > [...]
> >>
> >> I was hoping to avoid creating a duplicate class. But actually that
> >> would be fine and easier for testing. The implementation is trivial anyway.
> >
> > Why duplicate?
> > IIUC, shouldn't the existing "core" class define an additional constructor
> > that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
> > would use the default increment.
> > [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]
>
> OK. That was where I should have gone to start with. I’ll do that.
>
> >
> >> I've just finished an initial implementation of the MSWS RNG that uses a
> >> self-generated Weyl sequence. It works. So the idea for this can be
> >> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
> >> both. Perhaps moving the code to create the Weyl sequence increment to
> >> NumberFactory.
> >
> > +1
>
> OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when I’ll 
> get around to doing this though. It seems less important than other tasks. It 
> may even be redundant if the only reason is to increase the state space for 
> the generator. Each generator would still only have a period of 2^64. You can 
> just create a lot of them with a weak assurance they will not overlap and 
> that the uniformity is statistically the same. Since there are of the order 
> of 2^63 variants we are not going to be able to test this and would have to 
> rely on the theory behind it.
>
> If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can 
> create either 2^32 rngs with no overlap for the first 2^32 output numbers or 
> 2^16 rngs that can each be jumped 2^16 times with no overlap for the first 
> 2^32 output numbers. That is a lot for a small parallel situation and does 
> have the assurance of no overlap. Any parallel usage where longer sequences 
> are expected to be used can use one of the XorShiRo generators.

I'm a bit lost here.  I thought that "SplitMix64" did not provide
"jump", hence the
"Weyl way" to ensure no-overlap, with high probability (why "weak assurance"?)

Anyways, I agree that SPLIT_MIX_64_K is low priority, but especially if we
cannot ensure that the added flexibility does not come with unsuspected
drawbacks (?).  [IIRC, the article about "TwoCmres" reported experiments for
choosing the (hard-coded) values that do produce good sequences.]

Gilles

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-10 Thread Alex Herbert



> On 10 Apr 2019, at 18:52, Gilles Sadowski  wrote:
> 
> Hi.
> 
>> 
>> On 10/04/2019 18:58, Gilles Sadowski wrote:
>>> Hi.
>>> 
 [... long quote skipped where I think we largely agree on the
 conclusions ...]
 So do we have a working idea to:
 
 - Add interface 'JumpableUniformRandomProvider'
>>> Do we need to add "createJumpable" factory methods in "RandomSource"
>>> methods or is there a way to avoid the duplication?
>>> 
>>> As mentioned in an earlier post, it would be cleaner/nicer (?) to add
>>> methods
>>> UniformRandomProvider jump();
>>> boolean isJumpable();
>>> to "UniformRandomProvider".
>>> This would require dropping support of Java 6 and 7 and perhaps a good
>>> reason to do so (?) ...
>> 
>> And move to V2.0 with Java 8 giving the opportunity to clean up other
>> deprecated stuff.
> 
> No!
> Changing major version version would entail package names change
> (thus forcing user codes updates)
> We don't need this since there isn't any breaking change if we add
> default methods.

My bad. It is obvious when justified like that.

> 
>> 
>> Would the the default implementation be to throw an
>> UnsupportedOperationException?
> 
> Yes.
>> 
>> I'm +0 on this.
>> 
>> I'm not against it but do think the UniformRandomProvider interface
>> could become quite cluttered with these extra methods that would be in
>> the minority (jump, longJump, isJumpable, isLongJumpable are not
>> generally available). It would also allow methods/classes that currently
>> use simple methods from UniformRandomProvider to have access to call
>> jump on the generator and spawn lots of sub generators. I think this is
>> bad. These methods should be written to explicitly require a Jumpable
>> instance.
> 
> From this POV, I certainly agree.
> 
>> My approach would have been to leave RandomSource as is and then state
>> that the returned generator can be tested to see if it is Jumpable using
>> instanceof. Someone who is writing code to use a Jumpable RNG should be
>> fine with that since they would have to know a priory what RandomSource
>> to create to get a Jumpable.
> 
> If it makes more sense, I'm fine letting the user write the "ugly" code. ;-)

Not adding a dedicated method would mean everyone has to do this:

JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider) 
RandomSource.create(…)

But adding a mirror methods:

JumpableUniformRandomProvider RandomSource::createJumpable(…)
LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)

Does seem to add clutter to RandomSource. If we leave it out for now it can be 
added in the future.

Is there scope for needing the Jumpable to be detected through the API? E.g. 
add:

boolean RandomSource::isJumpable(RandomSource)

We would just need to maintain an EnumSet for Jumpable and likewise for 
LongJumpable. Again more clutter to the RandomSource interface but perhaps less 
so than a mirror create method that would throw IllegalArgumentException for 
any RandomSource specified that is not Jumpable.

> 
>> I would add a method to mimic RandomSource.unrestorable as
>> RandomSource.unjumpable. Or since they both would be doing the same
>> thing a new method RandomSource.restrict to 'restrict' functionality to
>> just the data generation methods in UniformRandomProvider. This restrict
>> method can be called by RandomSource.unrestorable and make that deprecated.
> 
> Looks neat.
> 
>>> 
 - Add interface 'LongJumpable... extends JumpableUniformRandomProvider'
>>> Same question...
>>> 
 - Test if a SplitMix variant with a self generated Weyl sequence can
 pass tests of uniformity. This would just require a seed of long[2], one
 for the state and one to use to derive the Weyl sequence increment.
>>> Is the new seed length a temporary workaround for the test,
>>> to be replaced with a new "SPLIT_MIX_64_K" provider, as
>>> mentioned in your previous message, if the test passes?
>>> 
>>> Gilles
>> 
>> I was hoping to avoid creating a duplicate class. But actually that
>> would be fine and easier for testing. The implementation is trivial anyway.
> 
> Why duplicate?
> IIUC, shouldn't the existing "core" class define an additional constructor
> that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
> would use the default increment.
> [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]

OK. That was where I should have gone to start with. I’ll do that.

> 
>> I've just finished an initial implementation of the MSWS RNG that uses a
>> self-generated Weyl sequence. It works. So the idea for this can be
>> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
>> both. Perhaps moving the code to create the Weyl sequence increment to
>> NumberFactory.
> 
> +1

OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when I’ll 
get around to doing this though. It seems less important than other tasks. It 
may even be redundant if the only reason is to increase the state space for the 
gener

Re: Re: [rng] Split and Jump functions

2019-04-10 Thread Gilles Sadowski
Hi.

>
> On 10/04/2019 18:58, Gilles Sadowski wrote:
> > Hi.
> >
> >> [... long quote skipped where I think we largely agree on the
> >> conclusions ...]
> >> So do we have a working idea to:
> >>
> >> - Add interface 'JumpableUniformRandomProvider'
> > Do we need to add "createJumpable" factory methods in "RandomSource"
> > methods or is there a way to avoid the duplication?
> >
> > As mentioned in an earlier post, it would be cleaner/nicer (?) to add
> > methods
> > UniformRandomProvider jump();
> > boolean isJumpable();
> > to "UniformRandomProvider".
> > This would require dropping support of Java 6 and 7 and perhaps a good
> > reason to do so (?) ...
>
> And move to V2.0 with Java 8 giving the opportunity to clean up other
> deprecated stuff.

No!
Changing major version version would entail package names change
(thus forcing user codes updates)
We don't need this since there isn't any breaking change if we add
default methods.

>
> Would the the default implementation be to throw an
> UnsupportedOperationException?

Yes.
>
> I'm +0 on this.
>
> I'm not against it but do think the UniformRandomProvider interface
> could become quite cluttered with these extra methods that would be in
> the minority (jump, longJump, isJumpable, isLongJumpable are not
> generally available). It would also allow methods/classes that currently
> use simple methods from UniformRandomProvider to have access to call
> jump on the generator and spawn lots of sub generators. I think this is
> bad. These methods should be written to explicitly require a Jumpable
> instance.

>From this POV, I certainly agree.

> My approach would have been to leave RandomSource as is and then state
> that the returned generator can be tested to see if it is Jumpable using
> instanceof. Someone who is writing code to use a Jumpable RNG should be
> fine with that since they would have to know a priory what RandomSource
> to create to get a Jumpable.

If it makes more sense, I'm fine letting the user write the "ugly" code. ;-)

> I would add a method to mimic RandomSource.unrestorable as
> RandomSource.unjumpable. Or since they both would be doing the same
> thing a new method RandomSource.restrict to 'restrict' functionality to
> just the data generation methods in UniformRandomProvider. This restrict
> method can be called by RandomSource.unrestorable and make that deprecated.

Looks neat.

> >
> >> - Add interface 'LongJumpable... extends JumpableUniformRandomProvider'
> > Same question...
> >
> >> - Test if a SplitMix variant with a self generated Weyl sequence can
> >> pass tests of uniformity. This would just require a seed of long[2], one
> >> for the state and one to use to derive the Weyl sequence increment.
> > Is the new seed length a temporary workaround for the test,
> > to be replaced with a new "SPLIT_MIX_64_K" provider, as
> > mentioned in your previous message, if the test passes?
> >
> > Gilles
>
> I was hoping to avoid creating a duplicate class. But actually that
> would be fine and easier for testing. The implementation is trivial anyway.

Why duplicate?
IIUC, shouldn't the existing "core" class define an additional constructor
that accepts the "K" argument.  Then the current "SPLIT_MIX_64"
would use the default increment.
[Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]

> I've just finished an initial implementation of the MSWS RNG that uses a
> self-generated Weyl sequence. It works. So the idea for this can be
> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
> both. Perhaps moving the code to create the Weyl sequence increment to
> NumberFactory.

+1

Regards,
Gilles

> >
> >> Alex
> >>
> >>
> >>> Regards,
> >>> Gilles
> >>>
>  Alex
> 
> 
>  [1] https://en.wikipedia.org/wiki/Weyl_sequence
> 
>  [2] The Jira ticket RNG-85 had a note about the seeding algorithm for
>  the generator being GPL. There are alternatives based on the
>  SplittableRandom seeding method that could be used instead to
>  create the
>  increment for the Weyl sequence. I've speed tested the provided
>  algorithm and it is about 85x slower than the one used in
>  SplittableRandom. Since that algorithm has an issue with the unsigned
>  shift not being modelled by the Binomial distribution an updated
>  algorithm could be used that would be novel so avoid the Oracle or GPL
>  licences.
> 
> > Best,
> > Gilles
> >
>  Alex
> 
>  [1] https://github.com/aappleby/smhasher
> 
>  [2] Using Long.bitCount(long ^ (long >>> 1)) to count transitions
> 
>  [3] The SplitMix64 is a simple linear series and thus can be
>  jumped in
>  any power of 2 up to the maximum for a long (which causes sequence
>  wrapping). It can actually be jumped any number of iterations using
>  BigInteger arithmetic but jumping in powers of 2 can be implemented
>  using long arithmetic where th

Fwd: Re: [rng] Split and Jump functions

2019-04-10 Thread Alex Herbert

Resending to list.


 Forwarded Message 
Subject:Re: [rng] Split and Jump functions
Date:   Wed, 10 Apr 2019 18:00:54 +0100
From:   Alex Herbert 
To: Gilles Sadowski 



On 10/04/2019 18:58, Gilles Sadowski wrote:

Hi.

[... long quote skipped where I think we largely agree on the 
conclusions ...]

So do we have a working idea to:

- Add interface 'JumpableUniformRandomProvider'

Do we need to add "createJumpable" factory methods in "RandomSource"
methods or is there a way to avoid the duplication?

As mentioned in an earlier post, it would be cleaner/nicer (?) to add 
methods

UniformRandomProvider jump();
boolean isJumpable();
to "UniformRandomProvider".
This would require dropping support of Java 6 and 7 and perhaps a good
reason to do so (?) ...


And move to V2.0 with Java 8 giving the opportunity to clean up other 
deprecated stuff.


Would the the default implementation be to throw an 
UnsupportedOperationException?


I'm +0 on this.

I'm not against it but do think the UniformRandomProvider interface 
could become quite cluttered with these extra methods that would be in 
the minority (jump, longJump, isJumpable, isLongJumpable are not 
generally available). It would also allow methods/classes that currently 
use simple methods from UniformRandomProvider to have access to call 
jump on the generator and spawn lots of sub generators. I think this is 
bad. These methods should be written to explicitly require a Jumpable 
instance.


My approach would have been to leave RandomSource as is and then state 
that the returned generator can be tested to see if it is Jumpable using 
instanceof. Someone who is writing code to use a Jumpable RNG should be 
fine with that since they would have to know a priory what RandomSource 
to create to get a Jumpable.


I would add a method to mimic RandomSource.unrestorable as 
RandomSource.unjumpable. Or since they both would be doing the same 
thing a new method RandomSource.restrict to 'restrict' functionality to 
just the data generation methods in UniformRandomProvider. This restrict 
method can be called by RandomSource.unrestorable and make that deprecated.





- Add interface 'LongJumpable... extends JumpableUniformRandomProvider'

Same question...


- Test if a SplitMix variant with a self generated Weyl sequence can
pass tests of uniformity. This would just require a seed of long[2], one
for the state and one to use to derive the Weyl sequence increment.

Is the new seed length a temporary workaround for the test,
to be replaced with a new "SPLIT_MIX_64_K" provider, as
mentioned in your previous message, if the test passes?

Gilles


I was hoping to avoid creating a duplicate class. But actually that 
would be fine and easier for testing. The implementation is trivial anyway.


I've just finished an initial implementation of the MSWS RNG that uses a 
self-generated Weyl sequence. It works. So the idea for this can be 
applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in 
both. Perhaps moving the code to create the Weyl sequence increment to 
NumberFactory.





Alex



Regards,
Gilles


Alex


[1] https://en.wikipedia.org/wiki/Weyl_sequence

[2] The Jira ticket RNG-85 had a note about the seeding algorithm for
the generator being GPL. There are alternatives based on the
SplittableRandom seeding method that could be used instead to 
create the

increment for the Weyl sequence. I've speed tested the provided
algorithm and it is about 85x slower than the one used in
SplittableRandom. Since that algorithm has an issue with the unsigned
shift not being modelled by the Binomial distribution an updated
algorithm could be used that would be novel so avoid the Oracle or GPL
licences.


Best,
Gilles


Alex

[1] https://github.com/aappleby/smhasher

[2] Using Long.bitCount(long ^ (long >>> 1)) to count transitions

[3] The SplitMix64 is a simple linear series and thus can be 
jumped in

any power of 2 up to the maximum for a long (which causes sequence
wrapping). It can actually be jumped any number of iterations using
BigInteger arithmetic but jumping in powers of 2 can be implemented
using long arithmetic where the rollover bits beyond 64 are 
naturally

discarded:

long jumps = 1234567;

long increment = 0x9e3779b97f4a7c15L;

state = BigInteger.valueOf(state)

.add(BigInteger.valueOf(increment).multiply(BigInteger.valueOf(jumps)))

.longValue(); // narrowing primitive conversion

int jumpPower = 32;

state = BigInteger.valueOf(state)

.add(BigInteger.valueOf(increment).shiftLeft(jumpPower))

.longValue(); // narrowing primitive conversion

// Same as above by discarding overflow bits

state = state + (increment << jumpPower);

This is based on my understanding of BigInteger and signed/unsigned
arithmetic and should be verified in tests.

[4] Given the period of the SplitMix is 2^64, and the period 
may be les

Re: [rng] Split and Jump functions

2019-04-10 Thread Gilles Sadowski
Hi.

> [... long quote skipped where I think we largely agree on the conclusions ...]

> So do we have a working idea to:
>
> - Add interface 'JumpableUniformRandomProvider'

Do we need to add "createJumpable" factory methods in "RandomSource"
methods or is there a way to avoid the duplication?

As mentioned in an earlier post, it would be cleaner/nicer (?) to add methods
UniformRandomProvider jump();
boolean isJumpable();
 to "UniformRandomProvider".
This would require dropping support of Java 6 and 7 and perhaps a good
reason to do so (?) ...

>
> - Add interface 'LongJumpable... extends JumpableUniformRandomProvider'

Same question...

> - Test if a SplitMix variant with a self generated Weyl sequence can
> pass tests of uniformity. This would just require a seed of long[2], one
> for the state and one to use to derive the Weyl sequence increment.

Is the new seed length a temporary workaround for the test,
to be replaced with a new "SPLIT_MIX_64_K" provider, as
mentioned in your previous message, if the test passes?

Gilles

>
> Alex
>
>
> >
> > Regards,
> > Gilles
> >
> >> Alex
> >>
> >>
> >> [1] https://en.wikipedia.org/wiki/Weyl_sequence
> >>
> >> [2] The Jira ticket RNG-85 had a note about the seeding algorithm for
> >> the generator being GPL. There are alternatives based on the
> >> SplittableRandom seeding method that could be used instead to create the
> >> increment for the Weyl sequence. I've speed tested the provided
> >> algorithm and it is about 85x slower than the one used in
> >> SplittableRandom. Since that algorithm has an issue with the unsigned
> >> shift not being modelled by the Binomial distribution an updated
> >> algorithm could be used that would be novel so avoid the Oracle or GPL
> >> licences.
> >>
> >>> Best,
> >>> Gilles
> >>>
> >> Alex
> >>
> >> [1] https://github.com/aappleby/smhasher
> >>
> >> [2] Using Long.bitCount(long ^ (long >>> 1)) to count transitions
> >>
> >> [3] The SplitMix64 is a simple linear series and thus can be jumped in
> >> any power of 2 up to the maximum for a long (which causes sequence
> >> wrapping). It can actually be jumped any number of iterations using
> >> BigInteger arithmetic but jumping in powers of 2 can be implemented
> >> using long arithmetic where the rollover bits beyond 64 are naturally
> >> discarded:
> >>
> >> long jumps = 1234567;
> >>
> >> long increment = 0x9e3779b97f4a7c15L;
> >>
> >> state = BigInteger.valueOf(state)
> >>
> >>   
> >> .add(BigInteger.valueOf(increment).multiply(BigInteger.valueOf(jumps)))
> >>
> >>   .longValue(); // narrowing primitive conversion
> >>
> >> int jumpPower = 32;
> >>
> >> state = BigInteger.valueOf(state)
> >>
> >>   
> >> .add(BigInteger.valueOf(increment).shiftLeft(jumpPower))
> >>
> >>   .longValue(); // narrowing primitive conversion
> >>
> >> // Same as above by discarding overflow bits
> >>
> >> state = state + (increment << jumpPower);
> >>
> >> This is based on my understanding of BigInteger and signed/unsigned
> >> arithmetic and should be verified in tests.
> >>
> >> [4] Given the period of the SplitMix is 2^64, and the period may be 
> >> less
> >> than this in practice it may be better to only support jumps of e.g.
> >> 2^32 in a manner similar to those provided for the XorShiRo generators
> >> where the state can be advanced a factor of the period, or just not
> >> supports jumps. I can see the utility in jumping more than
> >> Integer.MAX_VALUE (guaranteed unique outputs for the maximum array 
> >> size)
> >> but less than 2^32 is tending towards not very many random numbers from
> >> the original instance before sequence overlap with the jumped instance.
> >>
> >>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [rng] Split and Jump functions

2019-04-09 Thread Alex Herbert



On 09/04/2019 17:36, Gilles Sadowski wrote:

Hello.

Le mar. 9 avr. 2019 à 16:42, Alex Herbert  a écrit :

Hi Gilles,

Lots of sensible discussion here. I'll just put all replies at the end.

On 09/04/2019 13:28, Gilles Sadowski wrote:

Hello.


Hi Gilles,

You ask some good questions which I may have been vague about due to
familiarity with the possibilities. I hope to clarify a bit below.

On 08/04/2019 16:05, Gilles Sadowski wrote:

Hi Alex.

Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a écrit :

This is a starter for a discussion on the split and jump functionality
for a random generator.

Split:

To create a new instance of the generator that is deterministically
based on the state of the current instance but the probability that the
sequence generated by the new instance and the current instance overlap
is negligible.

I may well be mistaken but I seem to recall that a split is supposed
to create an instance with no overlap for a sequence below a certain
length.

   From the implementations I have found in the XorShiRo family they have
both a split and a jump.

The C implementations do not contain a "split" function, although the
Java ones do.
Is it only to preserve familiarity with the "SplittableRandom" API?
What are use-cases that would not be (better) covered by jump?


The split basically creates a new random generator. There are no
guarantees about sequence overlap. This is like seeding a new instance.
The difference is that it is deterministic based on the current state
and will return an instance of the same generator, that will be
different, and do it fast. It is very simple to do this. Just scramble
the current state using an algorithm different from how the state is
regularly updated. I see it as a "scrambled copy" type functionality.

Is the only purpose to get a second instance faster than by
going through the "RandomSource" factory?  [If so, how many
creations would be needed in order to see a noticeable difference?]
Or there is a inherent interest of generating a instance using
another's state a seed?  [I'd tend to think there isn't due to the lack
of this functionality in C codes.]


The jump is more constrained. It advances the generator to a point that
would be reached after a large number of calls to next(). Here's how the
documentation from XorShiro256StarStar describes its use (c-code):

/* This is the jump function for the generator. It is equivalent
  to 2^128 calls to next(); it can be used to generate 2^128
  non-overlapping subsequences for parallel computations. */

void jump(void);

Indeed, the "non-overlapping" guarantee is a added feature.

In comparison, "split" looks pretty bland (as just another way
to instantiate a generator).


/* This is the long-jump function for the generator. It is equivalent to
  2^192 calls to next(); it can be used to generate 2^64 starting points,
  from each of which jump() will generate 2^64 non-overlapping
  subsequences for parallel distributed computations. */

void long_jump(void);

So the idea is to seed an experiment once to get a single generator.
Then jump it for each parallel computation. Each computation will then
be guaranteed to run with a different sequence for at least as long as
the jump length.

This results in the following type of code using the API I suggested:

// If jump returns a new instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
   // Advance state
   rngs[i] = source.jump();
   source = rngs[i];
}

In my suggested API the jump returns a new instance. So calling jump
repeatedly on the same generator keeps returning the same fast-forward
state.

The least surprise behaviour.


This could lead to errors if not well documented how to use it.

To be documented, but it would be the same kind of errors as
forgetting to increment a counter.


The alternative is to advance the state of the same instance. So to get
the same effect for parallel computations you must have an ability to
copy a generator (which we do not have in the API).

// If jump updates the current instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
   // Advance state and copy
   rngs[i] = source.jump().copy();
}

We previously discussed copy(). IIRC the conclusion was that copy()
could be misused for parallel computations; it basically allows a
parallel set of work all with a copy of the same generator and so
limited randomness across the set.

Not sure what you mean by "limited randomness".


Leaving it out of the API out forces
the use of new generators for parallel work.

So there are at least three options for jump:

1. jump returns a new instance with an advanced state. Current state is
the same (possible errors when using this repeatedly)

2. jump advances the current state (requires separate copy me

Re: [rng] Split and Jump functions

2019-04-09 Thread Gilles Sadowski
Hello.

Le mar. 9 avr. 2019 à 16:42, Alex Herbert  a écrit :
>
> Hi Gilles,
>
> Lots of sensible discussion here. I'll just put all replies at the end.
>
> On 09/04/2019 13:28, Gilles Sadowski wrote:
> > Hello.
> >
> >> Hi Gilles,
> >>
> >> You ask some good questions which I may have been vague about due to
> >> familiarity with the possibilities. I hope to clarify a bit below.
> >>
> >> On 08/04/2019 16:05, Gilles Sadowski wrote:
> >>> Hi Alex.
> >>>
> >>> Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a 
> >>> écrit :
>  This is a starter for a discussion on the split and jump functionality
>  for a random generator.
> 
>  Split:
> 
>  To create a new instance of the generator that is deterministically
>  based on the state of the current instance but the probability that the
>  sequence generated by the new instance and the current instance overlap
>  is negligible.
> >>> I may well be mistaken but I seem to recall that a split is supposed
> >>> to create an instance with no overlap for a sequence below a certain
> >>> length.
> >>   From the implementations I have found in the XorShiRo family they have
> >> both a split and a jump.
> > The C implementations do not contain a "split" function, although the
> > Java ones do.
> > Is it only to preserve familiarity with the "SplittableRandom" API?
> > What are use-cases that would not be (better) covered by jump?
> >
> >> The split basically creates a new random generator. There are no
> >> guarantees about sequence overlap. This is like seeding a new instance.
> >> The difference is that it is deterministic based on the current state
> >> and will return an instance of the same generator, that will be
> >> different, and do it fast. It is very simple to do this. Just scramble
> >> the current state using an algorithm different from how the state is
> >> regularly updated. I see it as a "scrambled copy" type functionality.
> > Is the only purpose to get a second instance faster than by
> > going through the "RandomSource" factory?  [If so, how many
> > creations would be needed in order to see a noticeable difference?]
> > Or there is a inherent interest of generating a instance using
> > another's state a seed?  [I'd tend to think there isn't due to the lack
> > of this functionality in C codes.]
> >
> >> The jump is more constrained. It advances the generator to a point that
> >> would be reached after a large number of calls to next(). Here's how the
> >> documentation from XorShiro256StarStar describes its use (c-code):
> >>
> >> /* This is the jump function for the generator. It is equivalent
> >>  to 2^128 calls to next(); it can be used to generate 2^128
> >>  non-overlapping subsequences for parallel computations. */
> >>
> >> void jump(void);
> > Indeed, the "non-overlapping" guarantee is a added feature.
> >
> > In comparison, "split" looks pretty bland (as just another way
> > to instantiate a generator).
> >
> >> /* This is the long-jump function for the generator. It is equivalent to
> >>  2^192 calls to next(); it can be used to generate 2^64 starting 
> >> points,
> >>  from each of which jump() will generate 2^64 non-overlapping
> >>  subsequences for parallel distributed computations. */
> >>
> >> void long_jump(void);
> >>
> >> So the idea is to seed an experiment once to get a single generator.
> >> Then jump it for each parallel computation. Each computation will then
> >> be guaranteed to run with a different sequence for at least as long as
> >> the jump length.
> >>
> >> This results in the following type of code using the API I suggested:
> >>
> >> // If jump returns a new instance
> >> JumpableUniformRandomProvider source = ...;
> >> UniformRandomProvider[] rngs = new UniformRandomProvider[128];
> >> for (int i = 0; i < rngs.length; i++) {
> >>   // Advance state
> >>   rngs[i] = source.jump();
> >>   source = rngs[i];
> >> }
> >>
> >> In my suggested API the jump returns a new instance. So calling jump
> >> repeatedly on the same generator keeps returning the same fast-forward
> >> state.
> > The least surprise behaviour.
> >
> >> This could lead to errors if not well documented how to use it.
> > To be documented, but it would be the same kind of errors as
> > forgetting to increment a counter.
> >
> >> The alternative is to advance the state of the same instance. So to get
> >> the same effect for parallel computations you must have an ability to
> >> copy a generator (which we do not have in the API).
> >>
> >> // If jump updates the current instance
> >> JumpableUniformRandomProvider source = ...;
> >> UniformRandomProvider[] rngs = new UniformRandomProvider[128];
> >> for (int i = 0; i < rngs.length; i++) {
> >>   // Advance state and copy
> >>   rngs[i] = source.jump().copy();
> >> }
> >>
> >> We previously discussed copy(). IIRC the conclusion was that copy()
> >> could be misused for parallel computations; it basically allows a
> >> parallel set of wor

Re: [rng] Split and Jump functions

2019-04-09 Thread Alex Herbert

Hi Gilles,

Lots of sensible discussion here. I'll just put all replies at the end.

On 09/04/2019 13:28, Gilles Sadowski wrote:

Hello.


Hi Gilles,

You ask some good questions which I may have been vague about due to
familiarity with the possibilities. I hope to clarify a bit below.

On 08/04/2019 16:05, Gilles Sadowski wrote:

Hi Alex.

Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a écrit :

This is a starter for a discussion on the split and jump functionality
for a random generator.

Split:

To create a new instance of the generator that is deterministically
based on the state of the current instance but the probability that the
sequence generated by the new instance and the current instance overlap
is negligible.

I may well be mistaken but I seem to recall that a split is supposed
to create an instance with no overlap for a sequence below a certain
length.

  From the implementations I have found in the XorShiRo family they have
both a split and a jump.

The C implementations do not contain a "split" function, although the
Java ones do.
Is it only to preserve familiarity with the "SplittableRandom" API?
What are use-cases that would not be (better) covered by jump?


The split basically creates a new random generator. There are no
guarantees about sequence overlap. This is like seeding a new instance.
The difference is that it is deterministic based on the current state
and will return an instance of the same generator, that will be
different, and do it fast. It is very simple to do this. Just scramble
the current state using an algorithm different from how the state is
regularly updated. I see it as a "scrambled copy" type functionality.

Is the only purpose to get a second instance faster than by
going through the "RandomSource" factory?  [If so, how many
creations would be needed in order to see a noticeable difference?]
Or there is a inherent interest of generating a instance using
another's state a seed?  [I'd tend to think there isn't due to the lack
of this functionality in C codes.]


The jump is more constrained. It advances the generator to a point that
would be reached after a large number of calls to next(). Here's how the
documentation from XorShiro256StarStar describes its use (c-code):

/* This is the jump function for the generator. It is equivalent
 to 2^128 calls to next(); it can be used to generate 2^128
 non-overlapping subsequences for parallel computations. */

void jump(void);

Indeed, the "non-overlapping" guarantee is a added feature.

In comparison, "split" looks pretty bland (as just another way
to instantiate a generator).


/* This is the long-jump function for the generator. It is equivalent to
 2^192 calls to next(); it can be used to generate 2^64 starting points,
 from each of which jump() will generate 2^64 non-overlapping
 subsequences for parallel distributed computations. */

void long_jump(void);

So the idea is to seed an experiment once to get a single generator.
Then jump it for each parallel computation. Each computation will then
be guaranteed to run with a different sequence for at least as long as
the jump length.

This results in the following type of code using the API I suggested:

// If jump returns a new instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
  // Advance state
  rngs[i] = source.jump();
  source = rngs[i];
}

In my suggested API the jump returns a new instance. So calling jump
repeatedly on the same generator keeps returning the same fast-forward
state.

The least surprise behaviour.


This could lead to errors if not well documented how to use it.

To be documented, but it would be the same kind of errors as
forgetting to increment a counter.


The alternative is to advance the state of the same instance. So to get
the same effect for parallel computations you must have an ability to
copy a generator (which we do not have in the API).

// If jump updates the current instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
  // Advance state and copy
  rngs[i] = source.jump().copy();
}

We previously discussed copy(). IIRC the conclusion was that copy()
could be misused for parallel computations; it basically allows a
parallel set of work all with a copy of the same generator and so
limited randomness across the set.

Not sure what you mean by "limited randomness".


Leaving it out of the API out forces
the use of new generators for parallel work.

So there are at least three options for jump:

1. jump returns a new instance with an advanced state. Current state is
the same (possible errors when using this repeatedly)

2. jump advances the current state (requires separate copy method to be
of use)

3. jump advances the current state and returns a new copy instance
(combined jump and copy is possib

Re: [rng] Split and Jump functions

2019-04-09 Thread Gilles Sadowski
Hello.

>
> Hi Gilles,
>
> You ask some good questions which I may have been vague about due to
> familiarity with the possibilities. I hope to clarify a bit below.
>
> On 08/04/2019 16:05, Gilles Sadowski wrote:
> > Hi Alex.
> >
> > Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a 
> > écrit :
> >> This is a starter for a discussion on the split and jump functionality
> >> for a random generator.
> >>
> >> Split:
> >>
> >> To create a new instance of the generator that is deterministically
> >> based on the state of the current instance but the probability that the
> >> sequence generated by the new instance and the current instance overlap
> >> is negligible.
> > I may well be mistaken but I seem to recall that a split is supposed
> > to create an instance with no overlap for a sequence below a certain
> > length.
>
>  From the implementations I have found in the XorShiRo family they have
> both a split and a jump.

The C implementations do not contain a "split" function, although the
Java ones do.
Is it only to preserve familiarity with the "SplittableRandom" API?
What are use-cases that would not be (better) covered by jump?

>
> The split basically creates a new random generator. There are no
> guarantees about sequence overlap. This is like seeding a new instance.
> The difference is that it is deterministic based on the current state
> and will return an instance of the same generator, that will be
> different, and do it fast. It is very simple to do this. Just scramble
> the current state using an algorithm different from how the state is
> regularly updated. I see it as a "scrambled copy" type functionality.

Is the only purpose to get a second instance faster than by
going through the "RandomSource" factory?  [If so, how many
creations would be needed in order to see a noticeable difference?]
Or there is a inherent interest of generating a instance using
another's state a seed?  [I'd tend to think there isn't due to the lack
of this functionality in C codes.]

> The jump is more constrained. It advances the generator to a point that
> would be reached after a large number of calls to next(). Here's how the
> documentation from XorShiro256StarStar describes its use (c-code):
>
> /* This is the jump function for the generator. It is equivalent
> to 2^128 calls to next(); it can be used to generate 2^128
> non-overlapping subsequences for parallel computations. */
>
> void jump(void);

Indeed, the "non-overlapping" guarantee is a added feature.

In comparison, "split" looks pretty bland (as just another way
to instantiate a generator).

>
> /* This is the long-jump function for the generator. It is equivalent to
> 2^192 calls to next(); it can be used to generate 2^64 starting points,
> from each of which jump() will generate 2^64 non-overlapping
> subsequences for parallel distributed computations. */
>
> void long_jump(void);
>
> So the idea is to seed an experiment once to get a single generator.
> Then jump it for each parallel computation. Each computation will then
> be guaranteed to run with a different sequence for at least as long as
> the jump length.
>
> This results in the following type of code using the API I suggested:
>
> // If jump returns a new instance
> JumpableUniformRandomProvider source = ...;
> UniformRandomProvider[] rngs = new UniformRandomProvider[128];
> for (int i = 0; i < rngs.length; i++) {
>  // Advance state
>  rngs[i] = source.jump();
>  source = rngs[i];
> }
>
> In my suggested API the jump returns a new instance. So calling jump
> repeatedly on the same generator keeps returning the same fast-forward
> state.

The least surprise behaviour.

> This could lead to errors if not well documented how to use it.

To be documented, but it would be the same kind of errors as
forgetting to increment a counter.

> The alternative is to advance the state of the same instance. So to get
> the same effect for parallel computations you must have an ability to
> copy a generator (which we do not have in the API).
>
> // If jump updates the current instance
> JumpableUniformRandomProvider source = ...;
> UniformRandomProvider[] rngs = new UniformRandomProvider[128];
> for (int i = 0; i < rngs.length; i++) {
>  // Advance state and copy
>  rngs[i] = source.jump().copy();
> }
>
> We previously discussed copy(). IIRC the conclusion was that copy()
> could be misused for parallel computations; it basically allows a
> parallel set of work all with a copy of the same generator and so
> limited randomness across the set.

Not sure what you mean by "limited randomness".

> Leaving it out of the API out forces
> the use of new generators for parallel work.
>
> So there are at least three options for jump:
>
> 1. jump returns a new instance with an advanced state. Current state is
> the same (possible errors when using this repeatedly)
>
> 2. jump advances the current state (requires separate copy method to be
> of use)
>
> 3. jump advances the current st

Re: [rng] Split and Jump functions

2019-04-08 Thread Alex Herbert

Hi Gilles,

You ask some good questions which I may have been vague about due to 
familiarity with the possibilities. I hope to clarify a bit below.


On 08/04/2019 16:05, Gilles Sadowski wrote:

Hi Alex.

Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a écrit :

This is a starter for a discussion on the split and jump functionality
for a random generator.

Split:

To create a new instance of the generator that is deterministically
based on the state of the current instance but the probability that the
sequence generated by the new instance and the current instance overlap
is negligible.

I may well be mistaken but I seem to recall that a split is supposed
to create an instance with no overlap for a sequence below a certain
length.


From the implementations I have found in the XorShiRo family they have 
both a split and a jump.


The split basically creates a new random generator. There are no 
guarantees about sequence overlap. This is like seeding a new instance. 
The difference is that it is deterministic based on the current state 
and will return an instance of the same generator, that will be 
different, and do it fast. It is very simple to do this. Just scramble 
the current state using an algorithm different from how the state is 
regularly updated. I see it as a "scrambled copy" type functionality.


The jump is more constrained. It advances the generator to a point that 
would be reached after a large number of calls to next(). Here's how the 
documentation from XorShiro256StarStar describes its use (c-code):


/* This is the jump function for the generator. It is equivalent
   to 2^128 calls to next(); it can be used to generate 2^128
   non-overlapping subsequences for parallel computations. */

void jump(void);

/* This is the long-jump function for the generator. It is equivalent to
   2^192 calls to next(); it can be used to generate 2^64 starting points,
   from each of which jump() will generate 2^64 non-overlapping
   subsequences for parallel distributed computations. */

void long_jump(void);

So the idea is to seed an experiment once to get a single generator. 
Then jump it for each parallel computation. Each computation will then 
be guaranteed to run with a different sequence for at least as long as 
the jump length.


This results in the following type of code using the API I suggested:

// If jump returns a new instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
    // Advance state
    rngs[i] = source.jump();
    source = rngs[i];
}

In my suggested API the jump returns a new instance. So calling jump 
repeatedly on the same generator keeps returning the same fast-forward 
state. This could lead to errors if not well documented how to use it. 
The alternative is to advance the state of the same instance. So to get 
the same effect for parallel computations you must have an ability to 
copy a generator (which we do not have in the API).


// If jump updates the current instance
JumpableUniformRandomProvider source = ...;
UniformRandomProvider[] rngs = new UniformRandomProvider[128];
for (int i = 0; i < rngs.length; i++) {
    // Advance state and copy
    rngs[i] = source.jump().copy();
}

We previously discussed copy(). IIRC the conclusion was that copy() 
could be misused for parallel computations; it basically allows a 
parallel set of work all with a copy of the same generator and so 
limited randomness across the set. Leaving it out of the API out forces 
the use of new generators for parallel work.


So there are at least three options for jump:

1. jump returns a new instance with an advanced state. Current state is 
the same (possible errors when using this repeatedly)


2. jump advances the current state (requires separate copy method to be 
of use)


3. jump advances the current state and returns a new copy instance 
(combined jump and copy is possibly a bit confusing)


I would opt maybe instead for option 2 over the original option 1. But 
then that exposes the 'risks' of misuse of just copy().



Jump:

To create a new instance of the generator that is deterministically
based on the state of the current instance but advanced a set number of
iterations.

IOW, after a jump, the new state is equivalent to the state one would
have gotten to after a specified (large) number of calls to "next()".

Yes.



Split:

This is an alternative to simply creating a new instance of the same
generator using a different seed,

It's not my (limited/wrong ?) understanding.  What prevents the new
seed from creating a state that would be reached (by the original
instance) after just a few numbers?
Nothing. There are no guarantees with split. However it should be noted 
that for the generators with a large state space the probability of 
collision will be low. Plus it would be no worse than creating a new 
instance anyway, just faster.



or using the generator to create a
seed then creatin

Re: [rng] Split and Jump functions

2019-04-08 Thread Gilles Sadowski
Hi Alex.

Le lun. 8 avr. 2019 à 14:36, Alex Herbert  a écrit :
>
> This is a starter for a discussion on the split and jump functionality
> for a random generator.
>
> Split:
>
> To create a new instance of the generator that is deterministically
> based on the state of the current instance but the probability that the
> sequence generated by the new instance and the current instance overlap
> is negligible.

I may well be mistaken but I seem to recall that a split is supposed
to create an instance with no overlap for a sequence below a certain
length.

>
> Jump:
>
> To create a new instance of the generator that is deterministically
> based on the state of the current instance but advanced a set number of
> iterations.

IOW, after a jump, the new state is equivalent to the state one would
have gotten to after a specified (large) number of calls to "next()".

>
> Split:
>
> This is an alternative to simply creating a new instance of the same
> generator using a different seed,

It's not my (limited/wrong ?) understanding.  What prevents the new
seed from creating a state that would be reached (by the original
instance) after just a few numbers?

> or using the generator to create a
> seed then creating a new instance.

What is gained from using the current instance, rather than any other
generator to create a new seed (like it's done with the "SeedFactory")?

> In the later case this would advance
> the current generator but a split does not have to (more on this later).
>
> Assuming the state of a generator is a set of bits with no further
> requirements then a split can be performed by a bit mixing algorithm. A
> commonly used algorithm is the finalisation step of the MurmurHash3 of
> Austin Appleby [1] which has variants to mix int and long values from
> input int or long.
>
> Most of the algorithms in Commons RNG should have a state applicable to
> simple mixing.

Subject to my very partial knowledge, I assumed that "jump" was
the engine behind the "split" functionality (whose purpose is to
ensure a specified number of non-overlapping sequences).

> Notable exceptions are those that implement a controlled
> seeding procedure: MersenneTwister; MersenneTwister64; ISAAC; and
> TwoCmres. These have a state that is more controlled that just random
> bits. In most cases the split could be implemented by mixing the state
> to create a seed for a new instance, then running through the
> initialisation procedure. There may not be much value in this (over just
> forcing a user to create a new seeded instance) although making all the
> generators Splittable makes an easier to use library.

I thought that the question was whether a "jump" function always exists.
IOW, for some generators, the only way to get from one state to another
would be by passing through all the intermediate states.

> A notable exception is the SplitMix64 algorithm. This implements
> generation of long values identically to Java 8's SplittableRandom. The
> sequence is generated using a linear series of long values that is mixed
> to output random long values. The linear series is produced using a
> default increment that maximises the period of the sequence. A simple
> split implementation would just mix the current point in the series.
> However Java 8's SplittableRandom also mixes the increment constant
> (using a MurmurHash3 mix step). The constant is then forced to be odd
> and contain a high number of bit state transitions (i.e. 0s to 1s or 1s
> to 0s, the maximum for a long is 32) [2]. So should the Commons RNG
> class match the implementation of SplittableRandom? This may have
> licensing issues. Note: A side-effect of requiring two numbers for the
> split instance is the current instance is advanced.
>
> Given there are no decisions to be made by the user when splitting a
> suggestion for an API would be a new interface akin to
> RestorableUniformRandomProvider:
>
> package org.apache.commons.rng;
>
> public interface SplittableUniformRandomProvider extends
> UniformRandomProvider {
>
>  SplittableUniformRandomProvider split();
>
> }
>
>
> An implementation detail:
>
> Q. Should a split advance the state of the current generator? This
> allows successive split() calls from the same instance to create new
> variants.

I cannot give a pertinent opinion given my questions above...

IIUC what you wrote earlier (i.e. "split" is akin to create a new
instance from a different seed), it does not need to be defined
"internally".  And, actually, calling "RandomSource.create", without
specifying the seed, is indeed doing a "split" (?).]

If I'm correct (about "jump" ensuring non-overlapping sequences),
then "rng.split()" would just be equivalent to "rng.jump()".

> I would vote yes. It will prevent misuse where multiple threads are
> created that each have an instance provided by a call to split() on the
> same source generator that is not otherwise advanced.
>
> A user can always save the state of any of the generators in the library
> an

[rng] Split and Jump functions

2019-04-08 Thread Alex Herbert
This is a starter for a discussion on the split and jump functionality 
for a random generator.


Split:

To create a new instance of the generator that is deterministically 
based on the state of the current instance but the probability that the 
sequence generated by the new instance and the current instance overlap 
is negligible.


Jump:

To create a new instance of the generator that is deterministically 
based on the state of the current instance but advanced a set number of 
iterations.



Split:

This is an alternative to simply creating a new instance of the same 
generator using a different seed, or using the generator to create a 
seed then creating a new instance. In the later case this would advance 
the current generator but a split does not have to (more on this later).


Assuming the state of a generator is a set of bits with no further 
requirements then a split can be performed by a bit mixing algorithm. A 
commonly used algorithm is the finalisation step of the MurmurHash3 of 
Austin Appleby [1] which has variants to mix int and long values from 
input int or long.


Most of the algorithms in Commons RNG should have a state applicable to 
simple mixing. Notable exceptions are those that implement a controlled 
seeding procedure: MersenneTwister; MersenneTwister64; ISAAC; and 
TwoCmres. These have a state that is more controlled that just random 
bits. In most cases the split could be implemented by mixing the state 
to create a seed for a new instance, then running through the 
initialisation procedure. There may not be much value in this (over just 
forcing a user to create a new seeded instance) although making all the 
generators Splittable makes an easier to use library.


A notable exception is the SplitMix64 algorithm. This implements 
generation of long values identically to Java 8's SplittableRandom. The 
sequence is generated using a linear series of long values that is mixed 
to output random long values. The linear series is produced using a 
default increment that maximises the period of the sequence. A simple 
split implementation would just mix the current point in the series. 
However Java 8's SplittableRandom also mixes the increment constant 
(using a MurmurHash3 mix step). The constant is then forced to be odd 
and contain a high number of bit state transitions (i.e. 0s to 1s or 1s 
to 0s, the maximum for a long is 32) [2]. So should the Commons RNG 
class match the implementation of SplittableRandom? This may have 
licensing issues. Note: A side-effect of requiring two numbers for the 
split instance is the current instance is advanced.


Given there are no decisions to be made by the user when splitting a 
suggestion for an API would be a new interface akin to 
RestorableUniformRandomProvider:


package org.apache.commons.rng;

public interface SplittableUniformRandomProvider extends 
UniformRandomProvider {


    SplittableUniformRandomProvider split();

}


An implementation detail:

Q. Should a split advance the state of the current generator? This 
allows successive split() calls from the same instance to create new 
variants.


I would vote yes. It will prevent misuse where multiple threads are 
created that each have an instance provided by a call to split() on the 
same source generator that is not otherwise advanced.


A user can always save the state of any of the generators in the library 
and restore it to the state before the call to split if they so desire.



Jump:

This requires a generator that has a predictable state at some point in 
the future given the current state. The XorShiRo family have jump 
functions computed by the original authors. The jump size in power of 2 
is provided. The SplitMix is a simple linear series and so can be jumped 
any distance. Jumps functions are available for:


XOR_SHIFT_1024_S & XOR_SHIFT_1024_S_PHI (512)
XO_SHI_RO_128_PLUS & XO_SHI_RO_128_SS (64)
XO_RO_SHI_RO_128_PLUS & XO_RO_SHI_RO_128_SS (64, 96)
XO_SHI_RO_256_PLUS & XO_SHI_RO_256_SS (128, 192)
XO_SHI_RO_512_PLUS & XO_SHI_RO_512_SS (256)
SPLIT_MIX_64 (1 - 63) [3]

*** The list may not be complete as I have not checkout the original 
source for all generators. ***


The fact that the jump size can be different requires a more flexible 
API. Currently all the jumps are powers of 2. But this may not always be 
true. This allows the following specification to indicate the number of 
jumps that are supported:


package org.apache.commons.rng;

public interface JumpableUniformRandomProvider extends 
UniformRandomProvider {


    int getNumberOfJumps();

    JumpableUniformRandomProvider jump(int jump);

}

The implementation would then support any jump parameter up to the 
supported number of jumps.


The definition of the meaning of the jump parameter is ambiguous. It 
should be documented in the implementing class what the 'jump' refers to.


This would allow for example XO_SHI_RO_256_PLUS to return 2 for the two 
supported jump sizes and document the actual size of t