+1 to updating RNG to Java 8

Gary

On Fri, Jun 14, 2019 at 10:45 AM Alex Herbert <alex.d.herb...@gmail.com>
wrote:

>
> On 14/06/2019 13:29, sebb wrote:
> > On Fri, 14 Jun 2019 at 12:57, Alex Herbert <alex.d.herb...@gmail.com>
> wrote:
> >>
> >> On 14/06/2019 12:01, sebb wrote:
> >>> I meant that the iterator would use the shuffled and/or selected
> >>> indices to return the appropriate entry from the original array.
> >>>
> >>> No need to modify the original array.
> >> To shuffle an array requires storage as elements are swapped. Either
> >> store an index or store the data. So for primitive types with 4-bytes or
> >> less storage it is more efficient (memory wise) to copy the array and
> >> shuffle in place. For those with size of 8-bytes then you can create an
> >> int array of indices and shuffle that. The index can be used to take the
> >> sample from the original array.
> >>
> >> In terms of the RNG library the functionality to shuffle an entire array
> >> once (as per ArrayUtils) would be the same as the functions from [lang]
> >> ArrayUtils. So a decision should be made on whether to copy that
> >> functionality from [lang] and mark it as deprecated there.
> >>
> >> The idea of unlimited shuffling would be done as a sampler so this would
> >> go into the rng-sampling module. A single shuffle creates a permutation
> >> of the sequence. The sample would do this dynamically and repeatedly
> >> pass over the array.
> >>
> >> The sort of code would be:
> >>
> >> double[] array = { 1.1, 2.2, 3.3 };
> >> UniformRandomProvider rng = ...;
> >> DoubleContinuousPermutationSampler sampler = new
> >> DoubleContinuousPermutationSampler(rng, array);
> >> for (int i = 0; i < 100; i++) {
> >>       double sample = sampler.next();
> >> }
> >>
> >> The output would be the n values of the array in a random order without
> >> replacement. Once the entire set n is produced then the output would
> >> repeat in a new random order, and so on:
> >>
> >> 2.2, 1.1, 3.3, 3.3, 2.2, 1.1, 2.2, 3.3, 1.1, ...
> >>
> >> Currently the library does not specialise for primitives so a more
> >> generic sampler would only output a stream of int indices:
> >>
> >> double[] array = { 1.1, 2.2, 3.3 };
> >> UniformRandomProvider rng = ...;
> >> ContinuousPermutationSampler sampler = new
> >> ContinuousPermutationSampler(rng, array.length);
> >> for (int i = 0; i < 100; i++) {
> >>       double sample = array[sampler.next()];
> >> }
> >>
> >> If you are only interested in a complete single pass then creating a
> >> shuffled int[] of indices can be done using the existing
> >> PermutationSampler. This returns an int[] permutation from its sample
> >> method. The array can be used in place of an iterator:
> >>
> >> double[] array = { 1.1, 2.2, 3.3 };
> >> UniformRandomProvider rng = ...;
> >> for (int index : new PermutationSampler(rng, array.length,
> >> array.length).sample()) {
> >>       double sample = array[index];
> >> }
> > My proposal was basically to implement the above in wrapper methods
> > using an interface such as List or Iterator.
> You then have boxing of primitives. Commons RNG is not on JDK 8 so we
> cannot use the primitive specialisations of iterator for int, long and
> double.
>
> The basic Iterator does not support more functionality than a for loop
> over a primitive array. The extra from iterator is remove() which is not
> functionality we are discussing.
>
> This would be nice:
>
> IntStream PermutationSampler.stream(UniformRandomProvider rng, int n)
>
> The stream would be shuffled indices. Without JDK 8 you use the 1 liner:
>
> IntStream.of(new PermutationSampler(rng, array.length,
> array.length).sample());
>
> But with the lack of support for Stream materialisation when a
> terminating operation is called (i.e. the shuffle will be done even when
> the stream is not consumed). Ensuring lazy initialisation would require
> some atomic state in a PrimitiveIterator.OfInt implementation used to
> construct the stream. Any speed advantage of parallel streams would not
> help the shuffle as the full shuffle must be done before sub sections of
> the shuffled array can be used.
>
> I'd put all this on the maybe pile.
>
> >
> >> I'm not opposed to the continuous permutation sampler idea. I just can't
> >> think of a use case not already satisfied by the existing
> >> PermutationSampler, i.e. when you want to sample permutations using each
> >> index one at a time and not in chunks of indices. This type of idea:
> >>
> >> double[] array = { 1.1, 2.2, 3.3 };
> >> UniformRandomProvider rng;
> >> // The PermutationSampler will validate the length is >0
> >> final PermutationSampler sampler = new PermutationSampler(rng,
> >> array.length, array.length);
> >> IntSupplier supplier = new IntSupplier() {
> >>       int i;
> >>       int[] sample;
> >>       @Override
> >>       public int getAsInt() {
> >>           if (i == 0) {
> >>               sample = sampler.sample();
> >>               i = sample.length;
> >>           }
> >>           return sample[--i];
> >>       }
> >> };
> >> for (;;) {
> >>       double sample = array[supplier.getAsInt()];
> >> }
> >>
> >>
> >>> On Thu, 13 Jun 2019 at 18:15, Eric Barnhill <ericbarnh...@gmail.com>
> wrote:
> >>>> An iterator that dynamically shuffles as you go along. That's really
> nice,
> >>>> I had never even thought of that. Thanks.
> >>>>
> >>>> On Thu, Jun 13, 2019 at 10:11 AM Alex Herbert <
> alex.d.herb...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> On 13/06/2019 17:56, Eric Barnhill wrote:
> >>>>>> On Thu, Jun 13, 2019 at 9:36 AM sebb <seb...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Rather than shuffle etc in place, how about various
> >>>>>>> iterators/selectors to return entries in randomised order?
> >>>>>>> [Or does that already exist?]
> >>>>>>>
> >>>>>> I am pretty sure random draws, and shuffling, are implemented with
> >>>>>> different algorithms. Though sampling without replacement the full
> length
> >>>>>> of the set would yield a shuffled set, I think there are more
> efficient
> >>>>>> ways to shuffle a set.
> >>>>> Iterators to return a random draw *without* replacement over the full
> >>>>> length of the array? The iterator would dynamically shuffle the
> array on
> >>>>> each call to next() so could be stopped early or can be called
> >>>>> infinitely as if a continuous stream. Is that your idea?
> >>>>>
> >>>>> UniformRandomProvider rng = ...;
> >>>>> int[] big = new int[1000000];
> >>>>> //
> >>>>> // Fill big with lots of data
> >>>>> //
> >>>>> IntIterator iter = ShuffleIterators.create(rng, big);
> >>>>> int x = iter.next();
> >>>>> int y = iter.next();
> >>>>> int z = iter.next();
> >>>>>
> >>>>> This doesn't exist but it is easy to do. Memory requirements would
> >>>>> require a copy of the data, or it could be marked as destructive to
> the
> >>>>> input array order and shuffle in place.
> >>>>>
> >>>>> If you want a random draw *with* replacement then you can just call
> >>>>> nextInt(int) with the size of the array to pick something.
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>>>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>>>
> >>>>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >> For additional commands, e-mail: dev-h...@commons.apache.org
> >>
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>

Reply via email to