[ 
https://issues.apache.org/jira/browse/RNG-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695893#comment-16695893
 ] 

Alex D Herbert commented on RNG-57:
-----------------------------------

{quote}I'm not sure: What ensures that the previous implementation to retrieve 
a boolean from the underlying RNG is better?
{quote}
I'd not considered this. The only way to know would be to test it. This could 
be done by targeting the specific tests for RandomWalk within BigCrush using 
only the first bit (i.e. dropping all the other bits). This is possible using 
the C library API. Or build a RNG that composes an {{int}} using 32 calls to 
the underlying source using the upper bit from each source. This could be 
tested using the existing framework with all the test suites.

It is simpler to just advise against using poor RNGs and stick to those with 
low errors from BigCrush.

For systematic errors I have ignored dieharder_sums. I store a failure in each 
occurrence of the same test. For BigCrush the test has a test Id. This is the 
same even though the test may have multiple parts, e.g. this is one test 
failure:
{noformat}
 74  RandomWalk1 H (L=50, r=0)        eps  
 74  RandomWalk1 M (L=50, r=0)        eps  
 74  RandomWalk1 J (L=50, r=0)        eps  
 74  RandomWalk1 R (L=50, r=0)        eps  
 74  RandomWalk1 C (L=50, r=0)        eps  
{noformat}
For Dieharder the test has a name then a parameter (ntup), e.g. this is two 
tests:
{noformat}
      rgb_lagged_sum|  14|   1000000|     100|0.95188466|  PASSED  
      rgb_lagged_sum|  15|   1000000|     100|0.00000000|  FAILED  
{noformat}
Here's the old systematic failures table using the original txt results files 
for the {{LongProviders}}:
||RNG identifier||Test suite||Systematic failures||Fails||
|MT_64|TestU01 (BigCrush)|80:LinearComp|3/3|
|MT_64|TestU01 (BigCrush)|81:LinearComp|3/3|

Here's the new systematic failures table:
||RNG identifier||Test suite||Systematic failures||Fails||
|MT_64|TestU01 (BigCrush)|80:LinearComp|9/9|
|MT_64|TestU01 (BigCrush)|81:LinearComp|9/9|
|TWO_CMRES|Dieharder|diehard_dna:0|9/9|

So as stated previously the TWO_CMRES is now failing a Dieharder test that is 
marked as suspect. No other RNGs have changed to systematic failure.

FYI for the {{IntProviders}}:
||RNG identifier||Test suite||Systematic failures||Fails||
|JDK|Dieharder|diehard_oqso:0|9/9|
|JDK|Dieharder|diehard_dna:0|9/9|
|JDK|Dieharder|rgb_minimum_distance:3|9/9|
|JDK|Dieharder|rgb_minimum_distance:4|9/9|
|JDK|Dieharder|rgb_minimum_distance:5|9/9|
|JDK|Dieharder|rgb_lagged_sum:15|9/9|
|JDK|Dieharder|rgb_lagged_sum:31|9/9|
|JDK|Dieharder|dab_bytedistrib:0|9/9|
|JDK|Dieharder|dab_filltree:32|9/9|
|JDK|TestU01 (BigCrush)|1:SerialOver|9/9|
|JDK|TestU01 (BigCrush)|3:CollisionOver|9/9|
|JDK|TestU01 (BigCrush)|5:CollisionOver|9/9|
|JDK|TestU01 (BigCrush)|7:CollisionOver|9/9|
|JDK|TestU01 (BigCrush)|9:CollisionOver|9/9|
|JDK|TestU01 (BigCrush)|11:CollisionOver|9/9|
|JDK|TestU01 (BigCrush)|13:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|14:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|15:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|16:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|17:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|18:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|19:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|20:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|21:BirthdaySpacings|9/9|
|JDK|TestU01 (BigCrush)|22:ClosePairs|9/9|
|JDK|TestU01 (BigCrush)|23:ClosePairs|9/9|
|JDK|TestU01 (BigCrush)|24:ClosePairs|9/9|
|JDK|TestU01 (BigCrush)|25:ClosePairs|9/9|
|JDK|TestU01 (BigCrush)|26:SimpPoker|9/9|
|JDK|TestU01 (BigCrush)|28:SimpPoker|9/9|
|JDK|TestU01 (BigCrush)|30:CouponCollector|9/9|
|JDK|TestU01 (BigCrush)|31:CouponCollector|9/9|
|JDK|TestU01 (BigCrush)|34:Gap|9/9|
|JDK|TestU01 (BigCrush)|36:Gap|9/9|
|JDK|TestU01 (BigCrush)|40:Permutation|9/9|
|JDK|TestU01 (BigCrush)|41:Permutation|9/9|
|JDK|TestU01 (BigCrush)|42:Permutation|9/9|
|JDK|TestU01 (BigCrush)|43:Permutation|9/9|
|JDK|TestU01 (BigCrush)|44:CollisionPermut|9/9|
|JDK|TestU01 (BigCrush)|45:CollisionPermut|9/9|
|JDK|TestU01 (BigCrush)|46:MaxOft|9/9|
|JDK|TestU01 (BigCrush)|47:MaxOft|9/9|
|JDK|TestU01 (BigCrush)|48:MaxOft|9/9|
|JDK|TestU01 (BigCrush)|49:MaxOft|9/9|
|JDK|TestU01 (BigCrush)|50:SampleProd|9/9|
|JDK|TestU01 (BigCrush)|51:SampleProd|9/9|
|JDK|TestU01 (BigCrush)|52:SampleProd|9/9|
|JDK|TestU01 (BigCrush)|57:AppearanceSpacings|9/9|
|JDK|TestU01 (BigCrush)|59:WeightDistrib|9/9|
|JDK|TestU01 (BigCrush)|62:WeightDistrib|9/9|
|JDK|TestU01 (BigCrush)|63:WeightDistrib|9/9|
|JDK|TestU01 (BigCrush)|65:SumCollector|9/9|
|JDK|TestU01 (BigCrush)|74:RandomWalk1|9/9|
|JDK|TestU01 (BigCrush)|86:LongestHeadRun|9/9|
|JDK|TestU01 (BigCrush)|90:HammingWeight2|9/9|
|JDK|TestU01 (BigCrush)|95:HammingIndep|9/9|
|JDK|TestU01 (BigCrush)|97:HammingIndep|9/9|
|JDK|TestU01 (BigCrush)|99:HammingIndep|9/9|
|JDK|TestU01 (BigCrush)|101:Run|9/9|
|MT|TestU01 (BigCrush)|80:LinearComp|9/9|
|MT|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_512_A|TestU01 (BigCrush)|68:MatrixRank|9/9|
|WELL_512_A|TestU01 (BigCrush)|69:MatrixRank|9/9|
|WELL_512_A|TestU01 (BigCrush)|70:MatrixRank|9/9|
|WELL_512_A|TestU01 (BigCrush)|71:MatrixRank|9/9|
|WELL_512_A|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_512_A|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_1024_A|TestU01 (BigCrush)|70:MatrixRank|9/9|
|WELL_1024_A|TestU01 (BigCrush)|71:MatrixRank|9/9|
|WELL_1024_A|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_1024_A|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_19937_A|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_19937_A|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_19937_C|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_19937_C|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_44497_A|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_44497_A|TestU01 (BigCrush)|81:LinearComp|9/9|
|WELL_44497_B|TestU01 (BigCrush)|80:LinearComp|9/9|
|WELL_44497_B|TestU01 (BigCrush)|81:LinearComp|9/9|

So the JDK version is broken and the MT and Well variants fail LinearComplexity 
and small Well variants fail MatrixRank.

Here's the results table across 9 runs ignoring dieharder_sums:
||RNG identifier||Dieharder||TestU01 (BigCrush)||
|JDK|14 12 14 14 16 15 11 12 13|78 75 76 74 72 75 75 76 75|
|MT|0 0 0 0 0 0 0 0 0|2 2 2 3 2 2 4 2 2|
|WELL_512_A|0 0 0 0 0 0 0 0 0|6 6 6 7 6 6 6 6 6|
|WELL_1024_A|0 0 0 0 0 0 0 0 0|4 4 5 4 4 5 5 4 4|
|WELL_19937_A|0 0 0 0 0 0 0 0 0|2 3 4 3 2 2 2 3 2|
|WELL_19937_C|0 0 0 0 0 0 0 0 0|2 2 2 2 2 3 2 2 2|
|WELL_44497_A|0 0 0 0 0 0 0 0 0|3 3 2 2 3 3 2 2 2|
|WELL_44497_B|0 0 0 0 0 0 0 0 0|2 2 3 2 2 2 2 2 3|
|ISAAC|0 0 1 0 0 0 0 0 0|1 0 0 0 1 0 1 0 0|
|MT_64|0 0 0 0 0 1 0 0 0|3 4 3 4 2 3 3 2 3|
|SPLIT_MIX_64|0 0 0 0 0 0 0 0 0|0 1 0 0 0 0 2 0 0|
|XOR_SHIFT_1024_S|0 0 0 0 0 0 0 0 0|0 0 0 0 1 1 2 0 0|
|TWO_CMRES|1 1 1 1 1 1 1 1 1|0 0 1 2 0 1 0 0 1|
|MWC_256|0 0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0 0|
|KISS|0 0 0 0 0 0 0 0 0|0 1 0 1 2 0 0 0 1|

Here's the old {{LongProviders}} for reference:
||RNG identifier||Dieharder||TestU01 (BigCrush)||
|MT_64|0 0 0|2 2 3|
|SPLIT_MIX_64|0 0 0|0 0 0|
|XOR_SHIFT_1024_S|0 0 0|0 0 0|
|TWO_CMRES|0 1 0|0 0 0|

So of the {{LongProviders}}
 * SPLIT_MIX and XOR_SHIFT_1024_S fail a few BigCrush tests at random
 * MT_64 systematically fails 2 LinearComplexity tests and few others at random
 * TWO_CMRES systematically fails dieharder_dna and a few BigCrush at random

I have checked for a lower threshold for systematic failure, e.g. >20% failure 
rate (2/9) on a given test, and the results are the same except for the JDK and 
MT_64. For MT_64 there are 2/9 failures for BigCrush 77:RandomWalk1.

This means that the others failures that are present are due to random seeding 
as they never occur on the same test more than once.

> CachedUniformRandomProvider for nextBoolean() and nextInt()
> -----------------------------------------------------------
>
>                 Key: RNG-57
>                 URL: https://issues.apache.org/jira/browse/RNG-57
>             Project: Commons RNG
>          Issue Type: Improvement
>          Components: core
>            Reporter: Alex D Herbert
>            Priority: Minor
>              Labels: performance
>             Fix For: 1.2
>
>
> Implement a wrapper around a {{UniformRandomProvider}} that can cache the 
> underlying source of random bytes for use in the methods {{nextBoolean()}} 
> and {{nextInt()}} (in the case of {{LongProvider}}). E.g.
> {code:java}
> LongProvider provider = RandomSource.create(RandomSource.SPLIT_MIX_64);
> CachedLongProvider rng = new CachedLongProvider(provider);
> // Uses cached nextLong() 64 times
> rng.nextBoolean();
> // Uses cached nextLong() twice
> rng.nextInt();
> IntProvider provider = RandomSource.create(RandomSource.KISS);
> CachedIntProvider rng2 = new CachedIntProvider(provider);
> // Uses cached nextInt() 32 times
> rng2.nextBoolean();
> // This could be wrapped by a factory method:
> UniformRandomProvider rng = CachedUniformRandomProviderFactory.wrap(
>         // Any supported source: IntProvider or LongProvider
>         RandomSource.create(RandomSource...));
> {code}
> The implementation should be speed tested to determine the benefit for 
> {{nextBoolean()}} and if {{nextInt()}} can be improved for {{LongProviders}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to