Re: [rng] Split and Jump functions

Alex Herbert Wed, 01 May 2019 12:58:04 -0700


> On 1 May 2019, at 16:05, Gilles Sadowski <[email protected]> wrote:
> 
> Hi.
> 
> Le mar. 30 avr. 2019 à 17:08, Alex Herbert <[email protected] 
> <mailto:[email protected]>> a écrit :
>> 
>> On 29/04/2019 22:14, Gilles Sadowski wrote:
>>> Hello.
>>> 
>>> Le lun. 29 avr. 2019 à 19:09, Alex Herbert <[email protected]> a 
>>> écrit :
>>>> On 28/04/2019 19:11, Gilles Sadowski wrote:
>>>>> Le dim. 28 avr. 2019 à 17:02, Alex Herbert <[email protected]> a 
>>>>> écrit :
>>>>>> 
>>>>>>> On 28 Apr 2019, at 00:59, Bernd Eckenfels <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> Just a question, I am unclear on the terminology, is „jump“ (did I miss 
>>>>>>> the discussion leading toot?) something invented here? It sounds to me 
>>>>>>> like this is a generator where the state can be cloned and it is 
>>>>>>> „seekable“. It probably makes sense to have those two dimensions 
>>>>>>> separated anyway.
>>>>>> Hi Bernd, thanks for the input.
>>>>>> 
>>>>>> This thread started with the definition:
>>>>>> Jump:
>>>>>> 
>>>>>> To create a new instance of the generator that is deterministically 
>>>>>> based on the state of the current instance but advanced a set number of 
>>>>>> iterations.
>>>>>> 
>>>>>> 
>>>>>> However it is not required to create a new instance at the same time as 
>>>>>> jumping. You are correct in that this is two functionalities:
>>>>>> 
>>>>>> 1. Jump forward in the sequence
>>>>>> 2. Copy
>>>>>> 
>>>>>> However the two are coupled. Having jump on its own is useless (why move 
>>>>>> forward in the sequence without using it?). So a copy needs to be 
>>>>>> created somewhere before/after the jump.
>>>>>> 
>>>>>> The idea of a jump is to create a series of the generator at different 
>>>>>> points in the state. The generators can be used for parallel 
>>>>>> computations and will be ensured to not overlap in their output sequence 
>>>>>> for number of outputs skipped by the jump length.
>>>>>> 
>>>>>> FYI. The generators that support this have jump sizes of 2^64, 96, 128, 
>>>>>> 192, 256 and 512. So this is a lot of output sequence to jump.
>>>>>> 
>>>>>> Copy on its own works but for what purpose? If you want a second 
>>>>>> generator at the moment you just create a new one (with a different 
>>>>>> seed). Duplicate copies of generators is prone to potential pitfalls 
>>>>>> where simulations are not as random as you intend. For a special use 
>>>>>> case where you wish to run multiple simulations with the same generator 
>>>>>> you can use the Restorable interface to save the state of one and 
>>>>>> re-create it in other instances.
>>>>>> 
>>>>>> The current thread came to the choice of:
>>>>>> 
>>>>>>>>> So the options are (in all cases returning the copy):
>>>>>>>>> 
>>>>>>>>> 1. createAndJumpCopy
>>>>>>>>> 2. copyAndJumpParent
>>>>>>>>> 3. jumpParentAndCopy
>>>>>>>>> 4. jump and copy separately
>>>>>> Jump and copy separately was ruled out to discourage misuse of copy.
>>>>>> 
>>>>>> The current suggestion is 1. Create a copy and jump that ahead. The 
>>>>>> current instance is not affected.
>>>>>> 
>>>>>> I now consider this to be weaker for a variety of use cases than 2. This 
>>>>>> copies the current state for use and then jumps the parent ahead. So 
>>>>>> this alters the state of the parent generator.
>>>>>> 
>>>>>> Note that all other methods of a generator alter its state. So having 
>>>>>> jump alter its state is reasonable.
>>>>>> 
>>>>>> The most flexible API is to separate jump and copy into two methods. We 
>>>>>> can still support helper functions that take in a Jumpable generator and 
>>>>>> create a jump series of generators for parallel work. Separating jump 
>>>>>> and copy allows the functionality to be used in a larger number of ways 
>>>>>> than any other interface that attempts to combine jump and copy.
>>>>>> 
>>>>>> I am fine with having separate jump and copy. If so the copy method, 
>>>>>> being part of the Jumpable interface, will be functionally coupled with 
>>>>>> the jump method and should be described in Javadoc with the intended 
>>>>>> purpose to use it to copy the parent state either before or after a jump 
>>>>>> into a child generator.
>>>>>> 
>>>>>> As a precursor this API is very flexible:
>>>>>> 
>>>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>>>      /** Jump and return same instance. */
>>>>>>      JumpableUniformRandomProvider jump();
>>>>>>      /** Copy the instance. */
>>>>>>      JumpableUniformRandomProvider copy();
>>>>>> }
>>>>>> 
>>>>>> Returning the same instance in jump() allows method chaining such as 
>>>>>> either:
>>>>>> 
>>>>>> rng.jump().copy();
>>>>>> rng.copy().jump();
>>>>>> 
>>>>>> This potential pitfall is that a user may do this:
>>>>>> 
>>>>>> UniformRandomProvider rng1 = rng.copy().jump();
>>>>>> UniformRandomProvider rng2 = rng.copy().jump();
>>>>>> 
>>>>>> Where rng1 and 2 will be the same, 1 jump ahead of the parent state. Or:
>>>>>> 
>>>>>> UniformRandomProvider rng1 = rng.jump();
>>>>>> UniformRandomProvider rng2 = rng.jump();
>>>>>> 
>>>>>> Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of the 
>>>>>> start point.
>>>>>> 
>>>>>> I think our purpose is to provide an API for the generators that can 
>>>>>> jump and that is not too restrictive given the use cases we have so far 
>>>>>> thought up. There may be other ideas of use cases that cannot be done 
>>>>>> with the coupled functionality of:
>>>>>> 
>>>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>>>      /** Copy the instance, then jump ahead. Return the copy of the 
>>>>>> previous state. */
>>>>>>      JumpableUniformRandomProvider jump();
>>>>>> }
>>>>>> 
>>>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>>>      /** Copy the instance, then jump the copy ahead. Return the copy. 
>>>>>> The current instance is not affected. */
>>>>>>      JumpableUniformRandomProvider jump();
>>>>>> }
>>>>>> 
>>>>>> So the split functions without allowing method chaining:
>>>>>> 
>>>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>>>      /** Jump the current instance ahead. */
>>>>>>      void jump();
>>>>>>      /** Copy the instance. This is intended to be used either before or 
>>>>>> after a call to jump()
>>>>>>       * to create a series of generators. */
>>>>>>      JumpableUniformRandomProvider copy();
>>>>>> }
>>>>> As you indicated above, there is no advantage in having separate
>>>>> "jump()" and "copy()",
>>>>> as counter-intuitive it may look at first sight.
>>>> So is the favoured approach now to make the jump method return a copy of
>>>> the current state and then increment the state of the current instance
>>>> with a jump, as in:
>>>> 
>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>      /** Copy the instance, then jump ahead. Return the copy of the 
>>>> previous state. */
>>>>      JumpableUniformRandomProvider jump();
>>>> }
>>>> 
>>>> This would appear to solve more use cases than returning a copy that is
>>>> further ahead. For example the dynamic generation of a jump sequence of
>>>> an unknown size:
>>>> 
>>>> List<UniformRandomProvider> rngs = ...;
>>>> JumpableUniformRandomProvider rng;
>>>> for (Data data : Iterable<Data>) {
>>>>      rngs.add(rng.jump());
>>>> }
>>>> // Do something with all the Data and rngs
>>> I confirm agreement. ;-)
>> OK. I'll modify the Jira tickets to match.
>>> 
>>>> I note that if this is the functionality then the jump() method can be
>>>> written to return a UniformRandomProvider:
>>>> 
>>>> JumpableUniformRandomProvider extends UniformRandomProvider {
>>>>      /** Copy the instance, then jump ahead. Return the copy of the 
>>>> previous state. */
>>>>      UniformRandomProvider jump();
>>>> }
>>>> 
>>>> It would not have to be enforced to remove jump functionality from the
>>>> copy but should make it clear that the returned object is not to be
>>>> jumped again.
>>> Yes, but a cast could work around the artificial limitation of
>>> the return type, while "restrict" prevents misuse.
>> 
>> My point was which of the following do we want the interface to support:
>> 
>> 1. Return JumpableUniformRandomProvider (i.e. a direct copy)
>> 
>> 2. Return UniformRandomProvider (this could be a copy that could be cast
>> to Jumpable)
>> 
>> 3. Really return a UniformRandomProvider by enforcing the removal of the
>> jump function
>> 
>> For option 3 we will have to provide a helper method somewhere to do
>> this. Implementations should then use this to wrap a return copy.
>> 
>> However removal of the jump function by restriction to just
>> UniformRandomProvider will also remove support for
>> RestorableUniformRandomProvider, and any other interface the
>> implementation is providing.
>> 
>> I am thinking that option 2 is OK. It would not restrict the returned
>> copy but subtly suggests that the copy is not to be used for jumping.
> 
> OK. [Assuming that "jumping" main (only?) case is for a user to speed
> up computations, so that it is his responsiblity to not misuse (e.g. "jump"
> again) the returned instance.]


Jumping is for parallel work from a top down hierarchy of known size. Anyone 
jumping a RNG should know this and the pitfalls of jumping again (and 
potentially jumping into the region of another RNG output).

If you have a top down hierarchy of unknown size then this is where you need 
split. But as previously discussed, although simple to implement for many of 
the providers, it is no better than just creating a new instance using the 
RandomSource factory method other than to avoid synchronisation overhead. When 
a user requires this functionality then perhaps it can be revisited.


> 
>> We can enforce actual restriction in the helper method to be added to
>> RandomSource to create a jump series of RNGs from a Jumpable one. But
>> even then it should be optional (which you previously suggested) since
>> downstream consumers of the RNG may want Restore (or Jump) functionality.
> 
> No, when getting a "restricted" instance, downstream consumers should
> have no way to override it, especially in cases where they could mess
> other parts of the code.

I meant that the method for creating a series can optionally restrict. I agree 
that a restricted RNG is exactly that; no other functionality is allowed or 
exposed. However the method to create a series is probably redundant as you 
point out below.

> 
>> Consider that this could be solved by a smart restriction method that
>> takes an EnumSet of the interfaces that should be exposed. We then
>> support combinations of the various interfaces. So far we have:
>> 
>> UniformRandomProvider (always)
>> 
>> RestorableUniformRandomProvider (2 methods)
>> 
>> JumpableUniformRandomProvider (1 method)
>> 
>> LongJumpableUniformRandomProvider (1 method)
>> 
>> The total number of combinations of the 3 optional interfaces is 2^3 ==
>> 8 (the 3rd row from Pascal's triangle). Although since you can't be
>> LongJumpable and not Jumpable this drops to 6. Anyway it will get out of
>> hand if more interfaces are added. The point being choosing what to
>> restrict to has many options.
>> 
>> My suggestion would be to not enforce restriction in the interface for
>> jumps but make the return value the next level up the hierarchy:
>> 
>> longJump returns JumpableUniformRandomProvider
>> 
>> jump returns UniformRandomProvider
> 
> Yes, that seems interesting from a usage POV.
> 
>> Then in the helper method to create a jump series the returned array can
>> be the same type (UniformRandomProvider or
>> JumpableUniformRandomProvider) and the method has an option to enforce
>> this interface by wrapping.
> 
> But IIUC, the now preferred implementation make the convenience
> method not necessary (?).

It seems so. The need to precompute an array of RNGs using jump is satisfied by 
using the master RNG as a supplier of RNGs with the jump method. You can then 
create on demand a series of any length using the return value from jump.

It makes it easier to leave out extra methods from RandomSource to create a 
series. The only method perhaps desired in RandomSource is:

UniformRandomProvider restrict(JumpableUniformRandomProvider);

Akin to the current UniformRandomProvider 
unrestorable(RestorableUniformRandomProvider);

So do we do:

UniformRandomProvider restrict(JumpableUniformRandomProvider);
JumpableUniformRandomProvider restrict(LongJumpableUniformRandomProvider);
UniformRandomProvider restrict(RestorableUniformRandomProvider);

Or:

UniformRandomProvider unjumpable(JumpableUniformRandomProvider);
JumpableUniformRandomProvider unlongJumpable(LongJumpableUniformRandomProvider);
UniformRandomProvider unrestorable(RestorableUniformRandomProvider);

The later option only adds two new methods. The first has 3 new methods 
(deprecating unrestorable with restrict) but suffers from having to cast 
instances of multiple interfaces to ensure the correct restrict is called. So 
this makes me favour the verbosely named option.

Alex


> 
> Best,
> Gilles
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected] 
> <mailto:[email protected]>
> For additional commands, e-mail: [email protected] 
> <mailto:[email protected]>

Re: [rng] Split and Jump functions

Reply via email to