Re: [lang][rng] org.apache.commons.lang3.ArrayUtils.shuffle()

sebb Fri, 14 Jun 2019 05:29:54 -0700

On Fri, 14 Jun 2019 at 12:57, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
> On 14/06/2019 12:01, sebb wrote:
> > I meant that the iterator would use the shuffled and/or selected
> > indices to return the appropriate entry from the original array.
> >
> > No need to modify the original array.
>
> To shuffle an array requires storage as elements are swapped. Either
> store an index or store the data. So for primitive types with 4-bytes or
> less storage it is more efficient (memory wise) to copy the array and
> shuffle in place. For those with size of 8-bytes then you can create an
> int array of indices and shuffle that. The index can be used to take the
> sample from the original array.
>
> In terms of the RNG library the functionality to shuffle an entire array
> once (as per ArrayUtils) would be the same as the functions from [lang]
> ArrayUtils. So a decision should be made on whether to copy that
> functionality from [lang] and mark it as deprecated there.
>
> The idea of unlimited shuffling would be done as a sampler so this would
> go into the rng-sampling module. A single shuffle creates a permutation
> of the sequence. The sample would do this dynamically and repeatedly
> pass over the array.
>
> The sort of code would be:
>
> double[] array = { 1.1, 2.2, 3.3 };
> UniformRandomProvider rng = ...;
> DoubleContinuousPermutationSampler sampler = new
> DoubleContinuousPermutationSampler(rng, array);
> for (int i = 0; i < 100; i++) {
>      double sample = sampler.next();
> }
>
> The output would be the n values of the array in a random order without
> replacement. Once the entire set n is produced then the output would
> repeat in a new random order, and so on:
>
> 2.2, 1.1, 3.3, 3.3, 2.2, 1.1, 2.2, 3.3, 1.1, ...
>
> Currently the library does not specialise for primitives so a more
> generic sampler would only output a stream of int indices:
>
> double[] array = { 1.1, 2.2, 3.3 };
> UniformRandomProvider rng = ...;
> ContinuousPermutationSampler sampler = new
> ContinuousPermutationSampler(rng, array.length);
> for (int i = 0; i < 100; i++) {
>      double sample = array[sampler.next()];
> }
>
> If you are only interested in a complete single pass then creating a
> shuffled int[] of indices can be done using the existing
> PermutationSampler. This returns an int[] permutation from its sample
> method. The array can be used in place of an iterator:
>
> double[] array = { 1.1, 2.2, 3.3 };
> UniformRandomProvider rng = ...;
> for (int index : new PermutationSampler(rng, array.length,
> array.length).sample()) {
>      double sample = array[index];
> }


My proposal was basically to implement the above in wrapper methods
using an interface such as List or Iterator.

> I'm not opposed to the continuous permutation sampler idea. I just can't
> think of a use case not already satisfied by the existing
> PermutationSampler, i.e. when you want to sample permutations using each
> index one at a time and not in chunks of indices. This type of idea:
>
> double[] array = { 1.1, 2.2, 3.3 };
> UniformRandomProvider rng;
> // The PermutationSampler will validate the length is >0
> final PermutationSampler sampler = new PermutationSampler(rng,
> array.length, array.length);
> IntSupplier supplier = new IntSupplier() {
>      int i;
>      int[] sample;
>      @Override
>      public int getAsInt() {
>          if (i == 0) {
>              sample = sampler.sample();
>              i = sample.length;
>          }
>          return sample[--i];
>      }
> };
> for (;;) {
>      double sample = array[supplier.getAsInt()];
> }
>
>
> >
> > On Thu, 13 Jun 2019 at 18:15, Eric Barnhill <ericbarnh...@gmail.com> wrote:
> >> An iterator that dynamically shuffles as you go along. That's really nice,
> >> I had never even thought of that. Thanks.
> >>
> >> On Thu, Jun 13, 2019 at 10:11 AM Alex Herbert <alex.d.herb...@gmail.com>
> >> wrote:
> >>
> >>> On 13/06/2019 17:56, Eric Barnhill wrote:
> >>>> On Thu, Jun 13, 2019 at 9:36 AM sebb <seb...@gmail.com> wrote:
> >>>>
> >>>>> Rather than shuffle etc in place, how about various
> >>>>> iterators/selectors to return entries in randomised order?
> >>>>> [Or does that already exist?]
> >>>>>
> >>>> I am pretty sure random draws, and shuffling, are implemented with
> >>>> different algorithms. Though sampling without replacement the full length
> >>>> of the set would yield a shuffled set, I think there are more efficient
> >>>> ways to shuffle a set.
> >>> Iterators to return a random draw *without* replacement over the full
> >>> length of the array? The iterator would dynamically shuffle the array on
> >>> each call to next() so could be stopped early or can be called
> >>> infinitely as if a continuous stream. Is that your idea?
> >>>
> >>> UniformRandomProvider rng = ...;
> >>> int[] big = new int[1000000];
> >>> //
> >>> // Fill big with lots of data
> >>> //
> >>> IntIterator iter = ShuffleIterators.create(rng, big);
> >>> int x = iter.next();
> >>> int y = iter.next();
> >>> int z = iter.next();
> >>>
> >>> This doesn't exist but it is easy to do. Memory requirements would
> >>> require a copy of the data, or it could be marked as destructive to the
> >>> input array order and shuffle in place.
> >>>
> >>> If you want a random draw *with* replacement then you can just call
> >>> nextInt(int) with the size of the array to pick something.
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>> For additional commands, e-mail: dev-h...@commons.apache.org
> >>>
> >>>
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [lang][rng] org.apache.commons.lang3.ArrayUtils.shuffle()

Reply via email to