On Fri, 14 Jun 2019 at 12:57, Alex Herbert <alex.d.herb...@gmail.com> wrote: > > > On 14/06/2019 12:01, sebb wrote: > > I meant that the iterator would use the shuffled and/or selected > > indices to return the appropriate entry from the original array. > > > > No need to modify the original array. > > To shuffle an array requires storage as elements are swapped. Either > store an index or store the data. So for primitive types with 4-bytes or > less storage it is more efficient (memory wise) to copy the array and > shuffle in place. For those with size of 8-bytes then you can create an > int array of indices and shuffle that. The index can be used to take the > sample from the original array. > > In terms of the RNG library the functionality to shuffle an entire array > once (as per ArrayUtils) would be the same as the functions from [lang] > ArrayUtils. So a decision should be made on whether to copy that > functionality from [lang] and mark it as deprecated there. > > The idea of unlimited shuffling would be done as a sampler so this would > go into the rng-sampling module. A single shuffle creates a permutation > of the sequence. The sample would do this dynamically and repeatedly > pass over the array. > > The sort of code would be: > > double[] array = { 1.1, 2.2, 3.3 }; > UniformRandomProvider rng = ...; > DoubleContinuousPermutationSampler sampler = new > DoubleContinuousPermutationSampler(rng, array); > for (int i = 0; i < 100; i++) { > double sample = sampler.next(); > } > > The output would be the n values of the array in a random order without > replacement. Once the entire set n is produced then the output would > repeat in a new random order, and so on: > > 2.2, 1.1, 3.3, 3.3, 2.2, 1.1, 2.2, 3.3, 1.1, ... > > Currently the library does not specialise for primitives so a more > generic sampler would only output a stream of int indices: > > double[] array = { 1.1, 2.2, 3.3 }; > UniformRandomProvider rng = ...; > ContinuousPermutationSampler sampler = new > ContinuousPermutationSampler(rng, array.length); > for (int i = 0; i < 100; i++) { > double sample = array[sampler.next()]; > } > > If you are only interested in a complete single pass then creating a > shuffled int[] of indices can be done using the existing > PermutationSampler. This returns an int[] permutation from its sample > method. The array can be used in place of an iterator: > > double[] array = { 1.1, 2.2, 3.3 }; > UniformRandomProvider rng = ...; > for (int index : new PermutationSampler(rng, array.length, > array.length).sample()) { > double sample = array[index]; > }
My proposal was basically to implement the above in wrapper methods using an interface such as List or Iterator. > I'm not opposed to the continuous permutation sampler idea. I just can't > think of a use case not already satisfied by the existing > PermutationSampler, i.e. when you want to sample permutations using each > index one at a time and not in chunks of indices. This type of idea: > > double[] array = { 1.1, 2.2, 3.3 }; > UniformRandomProvider rng; > // The PermutationSampler will validate the length is >0 > final PermutationSampler sampler = new PermutationSampler(rng, > array.length, array.length); > IntSupplier supplier = new IntSupplier() { > int i; > int[] sample; > @Override > public int getAsInt() { > if (i == 0) { > sample = sampler.sample(); > i = sample.length; > } > return sample[--i]; > } > }; > for (;;) { > double sample = array[supplier.getAsInt()]; > } > > > > > > On Thu, 13 Jun 2019 at 18:15, Eric Barnhill <ericbarnh...@gmail.com> wrote: > >> An iterator that dynamically shuffles as you go along. That's really nice, > >> I had never even thought of that. Thanks. > >> > >> On Thu, Jun 13, 2019 at 10:11 AM Alex Herbert <alex.d.herb...@gmail.com> > >> wrote: > >> > >>> On 13/06/2019 17:56, Eric Barnhill wrote: > >>>> On Thu, Jun 13, 2019 at 9:36 AM sebb <seb...@gmail.com> wrote: > >>>> > >>>>> Rather than shuffle etc in place, how about various > >>>>> iterators/selectors to return entries in randomised order? > >>>>> [Or does that already exist?] > >>>>> > >>>> I am pretty sure random draws, and shuffling, are implemented with > >>>> different algorithms. Though sampling without replacement the full length > >>>> of the set would yield a shuffled set, I think there are more efficient > >>>> ways to shuffle a set. > >>> Iterators to return a random draw *without* replacement over the full > >>> length of the array? The iterator would dynamically shuffle the array on > >>> each call to next() so could be stopped early or can be called > >>> infinitely as if a continuous stream. Is that your idea? > >>> > >>> UniformRandomProvider rng = ...; > >>> int[] big = new int[1000000]; > >>> // > >>> // Fill big with lots of data > >>> // > >>> IntIterator iter = ShuffleIterators.create(rng, big); > >>> int x = iter.next(); > >>> int y = iter.next(); > >>> int z = iter.next(); > >>> > >>> This doesn't exist but it is easy to do. Memory requirements would > >>> require a copy of the data, or it could be marked as destructive to the > >>> input array order and shuffle in place. > >>> > >>> If you want a random draw *with* replacement then you can just call > >>> nextInt(int) with the size of the array to pick something. > >>> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > >>> For additional commands, e-mail: dev-h...@commons.apache.org > >>> > >>> > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org