Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Emmanuel Bourg
I would rather keep the ArrayUtils.shuffle() methods not deprecated, and 
mention RNG in the Javadoc for more advanced usages. Adding a 100KB 
dependency just to shuffle an array isn't optimal.


Emmanuel Bourg


On 06/12/2022 at 21:40, Gary Gregory wrote:
> I am ok with both LANG and TEXT deprecating to RNG.
>
> Gary

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org










Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Gary Gregory
I am ok with both LANG and TEXT deprecating to RNG.

Gary



Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Alex Herbert
On Tue, 6 Dec 2022 at 17:22, Gary Gregory  wrote:
>
> I agree this should be in rng.
>
> Does rng duplicate all of the lang APIs such that we can deprecate the lang
> methods?

In short, yes.

(cd src/main && git grep -c Random)
- ArrayUtils
- RandomStringUtils
- RandomUtils

The proposed ArraySampler with shuffle methods for all array types
would deprecate ArrayUtils.shuffle. You would have to provide a
UniformRandomProvider in place of a java.util.Random.

RandomStringUtils is not explicitly deprecated. However the class
javadoc states that Commons Text's RandomStringGenerator and,
generically, Commons RNG are more suitable for random generation.

RandomUtils is already deprecated. It mentions RNG in the header, but
its functionality is static, thread-safe calls for random numbers.
The class can be partly replaced by changing calls from
RandomUtils.xxx to ThreadLocalRandom.current().xxx. The class uses
ThreadLocalRandom under the hood, but does not act as a pass-through
for all methods. It looks like this method could be updated to use
ThreadLocalRandom's implementation directly:

nextLong(long upper)

Note: This method in RandomUtils does not check upper > 0, which is a bug.

However, the bounds for some methods are different, some have extra
conditions, and some are missing from ThreadLocalRandom.

method                   : lang                                : ThreadLocalRandom
nextBytes(int)           : present                             : NA
nextDouble()             : [0, MAX_VALUE)                      : [0, 1)
nextDouble(lower, upper) : [lower, upper) | upper > lower >= 0 : [lower, upper) | upper > lower
nextFloat()              : [0, MAX_VALUE)                      : [0, 1)
nextFloat(lower, upper)  : [lower, upper) | upper > lower >= 0 : NA
nextInt()                : [0, MAX_VALUE)                      : [MIN_VALUE, MAX_VALUE]
nextInt(upper)           : NA                                  : [0, upper)
nextInt(lower, upper)    : [lower, upper) | upper > lower >= 0 : [lower, upper) | upper > lower
nextLong()               : [0, MAX_VALUE)                      : [MIN_VALUE, MAX_VALUE]
nextLong(upper)          : [0, upper) [no check upper > 0]     : [0, upper)
nextLong(lower, upper)   : [lower, upper) | upper > lower >= 0 : [lower, upper) | upper > lower
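
For illustration, two of the ThreadLocalRandom entries in the table can be
probed directly against the JDK. This is only a sketch: the class name
BoundsCheck and its method names are made up here, and it relies on nothing
beyond java.util.concurrent.ThreadLocalRandom.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper (not part of [lang] or [rng]) probing ThreadLocalRandom bounds. */
final class BoundsCheck {
    private BoundsCheck() {}

    /** Returns true if ThreadLocalRandom.nextLong(upper) rejects the given bound. */
    static boolean rejectsUpper(long upper) {
        try {
            ThreadLocalRandom.current().nextLong(upper);
            return false;
        } catch (IllegalArgumentException e) {
            // The JDK requires upper > 0 for the single-argument form.
            return true;
        }
    }

    /** Draws from [lower, upper); unlike the lang column, lower may be negative. */
    static int nextInt(int lower, int upper) {
        return ThreadLocalRandom.current().nextInt(lower, upper);
    }
}
```

So rejectsUpper(0) reports a rejection, while nextInt(-10, -5) is a valid
call that stays inside the negative range.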

All these methods are in the UniformRandomProvider interface from
[rng], including the nextFloat with ranges but with the exception of
nextBytes(int count). The generators provide nextBytes(byte[]) and you
must supply the array.

In this case it may be helpful to document each method with an
equivalent from ThreadLocalRandom that provides a thread-safe static
call to generate the same output (with the exception that lower bounds
can be negative in ThreadLocalRandom).
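
A rough sketch of what such documented equivalents might look like. The class
ThreadLocalEquivalents and its methods are hypothetical names for this
example only; the non-negative lower bound is taken from the lang column of
the table above, not checked against the RandomUtils source.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical static, thread-safe equivalents built on ThreadLocalRandom. */
final class ThreadLocalEquivalents {
    private ThreadLocalEquivalents() {}

    /** nextBytes(int count): allocate the array, then fill it via nextBytes(byte[]). */
    static byte[] nextBytes(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative: " + count);
        }
        byte[] bytes = new byte[count];
        ThreadLocalRandom.current().nextBytes(bytes);
        return bytes;
    }

    /** nextLong(lower, upper) with the extra lang-style condition lower >= 0 (assumption). */
    static long nextLong(long lower, long upper) {
        if (lower < 0) {
            throw new IllegalArgumentException("lower must be non-negative: " + lower);
        }
        // ThreadLocalRandom itself rejects lower >= upper.
        return ThreadLocalRandom.current().nextLong(lower, upper);
    }
}
```

Dropping the lower >= 0 check leaves a plain pass-through to
ThreadLocalRandom.current().nextLong(lower, upper).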

Alex




Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Gary Gregory
I agree this should be in rng.

Does rng duplicate all of the lang APIs such that we can deprecate the lang
methods?

Gary


On Tue, Dec 6, 2022, 09:36 Alex Herbert  wrote:

> Currently the [rng] sampling package can only shuffle primitive int[]
> arrays:
>
> o.a.c.rng.sampling.PermutationSampler:
>
> public static void shuffle(UniformRandomProvider rng, int[] list)
> public static void shuffle(UniformRandomProvider rng,
>                            int[] list,
>                            int start,
>                            boolean towardHead)
>
> I would like to be able to shuffle other arrays such as double[].
> There is actually this functionality in [Lang] o.a.c.lang3.ArrayUtils.
> However it uses java.util.Random for the random source, and does not
> support a sub-range, e.g.
>
> public static void shuffle(final byte[] array)
> public static void shuffle(final byte[] array, final Random random)
>
> I suggest an API that requires UniformRandomProvider and can handle
> sub-ranges as:
>
> public static void shuffle(UniformRandomProvider rng, int[] data);
> public static void shuffle(UniformRandomProvider rng, int[] data,
>                            int from, int to);
>
> Or (similar to java.util.Arrays.copyOfRange):
>
> public static void shuffleOfRange(UniformRandomProvider rng, int[] data,
>                                   int from, int to);
>
> This can be repeated for all 8 primitive types and generic type T.
>
> I suggest putting this in the sampling package but under what class?
> Note that all public class names in the sampling package currently end
> in Sampler. I would suggest ArraySampler.
>
> Note there is currently a ListSampler which has generic methods to
> return List samples from a list, and shuffle lists. So adding
> ArraySampler with only shuffling would be missing equivalent sample
> methods. Consistency would require adding 8 variations of sample and a
> generic one:
>
> public static double[] sample(UniformRandomProvider rng,
>                               double[] array,
>                               int k)
> public static <T> T[] sample(UniformRandomProvider rng,
>                              T[] array,
>                              int k)
>
> I have no use case for this but can add it for completeness.
>
> Alex
>
>
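
As a concrete reading of the proposed sub-range shuffle, here is a minimal
Fisher-Yates sketch for double[]. It is not the [rng] implementation: the
class ArrayShuffleSketch and the interface BoundedIntSource are invented for
this example, the interface standing in for the single bounded-int method
that would be needed from UniformRandomProvider.

```java
/** Hypothetical sketch of the proposed shuffle(rng, data, from, to) for double[]. */
final class ArrayShuffleSketch {
    private ArrayShuffleSketch() {}

    /** Stand-in (assumption) for the random source; returns a uniform value in [0, n). */
    interface BoundedIntSource {
        int nextInt(int n);
    }

    /** Shuffles the whole array in place. */
    static void shuffle(BoundedIntSource rng, double[] data) {
        shuffle(rng, data, 0, data.length);
    }

    /** Shuffles data[from, to) in place; elements outside the range are untouched. */
    static void shuffle(BoundedIntSource rng, double[] data, int from, int to) {
        if (from < 0 || to > data.length || from > to) {
            throw new IndexOutOfBoundsException("Invalid range [" + from + ", " + to + ")");
        }
        // Fisher-Yates: walk down from the top of the range and swap each
        // position with a uniformly chosen position at or below it.
        for (int i = to - 1; i > from; i--) {
            int j = from + rng.nextInt(i - from + 1);
            double tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

With a java.util.Random r, the call would be
ArrayShuffleSketch.shuffle(n -> r.nextInt(n), data, 2, 6). The same loop
carries over to the other primitive types and to T[] by changing the element
type.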


Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Alex Herbert
On Tue, 6 Dec 2022 at 14:38, Bruno Kinoshita  wrote:
>
> Hi Alex,
>
> I also don't have a use case for this right now. What about creating a JIRA
> issue to wait to see if someone has the need for this feature? Maybe users
> will confirm they need it, or provide other suggestions?
>
> -Bruno

I do have a use case for shuffling primitive arrays, but not for
sampling from primitive arrays. So I can create Jira issues for:

1. Add an ArraySampler with shuffle methods (to implement)
2. Add sampling methods to the ArraySampler (TBD)
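
For item 2, a sketch of what sample(rng, array, k) could do, using a partial
Fisher-Yates pass over a copy of the input. The names ArraySampleSketch and
BoundedIntSource are hypothetical, and accepting k == 0 is a choice made for
this example only.

```java
import java.util.Arrays;

/** Hypothetical sketch of sampling k elements without replacement from a double[]. */
final class ArraySampleSketch {
    private ArraySampleSketch() {}

    /** Stand-in (assumption) for the random source; returns a uniform value in [0, n). */
    interface BoundedIntSource {
        int nextInt(int n);
    }

    /** Returns k elements drawn without replacement; the input array is not modified. */
    static double[] sample(BoundedIntSource rng, double[] array, int k) {
        if (k < 0 || k > array.length) {
            throw new IllegalArgumentException("k must be in [0, " + array.length + "]: " + k);
        }
        double[] copy = array.clone();
        int n = copy.length;
        // Partial Fisher-Yates: fix positions 0..k-1, each chosen uniformly
        // from the elements not yet placed.
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(n - i);
            double tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return Arrays.copyOf(copy, k);
    }
}
```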

Alex




Re: [rng][lang] Shuffling arrays

2022-12-06 Thread Bruno Kinoshita
Hi Alex,

I also don't have a use case for this right now. What about creating a JIRA
issue to wait to see if someone has the need for this feature? Maybe users
will confirm they need it, or provide other suggestions?

-Bruno

