[ https://issues.apache.org/jira/browse/RNG-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695893#comment-16695893 ]
Alex D Herbert commented on RNG-57: ----------------------------------- {quote}I'm not sure: What ensures that the previous implementation to retrieve a boolean from the underlying RNG is better? {quote} I'd not considered this. The only way to know would be to test it. This could be done by targeting the specific tests for RandomWalk within BigCrush using only the first bit (i.e. dropping all the other bits). This is possible using the C library API. Or build a RNG that composes an {{int}} using 32 calls to the underlying source using the upper bit from each source. This could be tested using the existing framework with all the test suites. It is simpler to just advise against using poor RNGs and stick to those with low errors from BigCrush. For systematic errors I have ignored dieharder_sums. I store a failure in each occurrence of the same test. For BigCrush the test has a test Id. This is the same even though the test may have multiple parts, e.g. this is one test failure: {noformat} 74 RandomWalk1 H (L=50, r=0) eps 74 RandomWalk1 M (L=50, r=0) eps 74 RandomWalk1 J (L=50, r=0) eps 74 RandomWalk1 R (L=50, r=0) eps 74 RandomWalk1 C (L=50, r=0) eps {noformat} For Dieharder the test has a name then a parameter (ntup), e.g. this is two tests: {noformat} rgb_lagged_sum| 14| 1000000| 100|0.95188466| PASSED rgb_lagged_sum| 15| 1000000| 100|0.00000000| FAILED {noformat} Here's the old systematic failures table using the original txt results files for the {{LongProviders}}: ||RNG identifier||Test suite||Systematic failures||Fails|| |MT_64|TestU01 (BigCrush)|80:LinearComp|3/3| |MT_64|TestU01 (BigCrush)|81:LinearComp|3/3| Here's the new systematic failures table: ||RNG identifier||Test suite||Systematic failures||Fails|| |MT_64|TestU01 (BigCrush)|80:LinearComp|9/9| |MT_64|TestU01 (BigCrush)|81:LinearComp|9/9| |TWO_CMRES|Dieharder|diehard_dna:0|9/9| So as stated previously the TWO_CMRES is now failing a Dieharder test that is marked as suspect. No other RNGs have changed to systematic failure. FYI for the {{IntProviders}}: ||RNG identifier||Test suite||Systematic failures||Fails|| |JDK|Dieharder|diehard_oqso:0|9/9| |JDK|Dieharder|diehard_dna:0|9/9| |JDK|Dieharder|rgb_minimum_distance:3|9/9| |JDK|Dieharder|rgb_minimum_distance:4|9/9| |JDK|Dieharder|rgb_minimum_distance:5|9/9| |JDK|Dieharder|rgb_lagged_sum:15|9/9| |JDK|Dieharder|rgb_lagged_sum:31|9/9| |JDK|Dieharder|dab_bytedistrib:0|9/9| |JDK|Dieharder|dab_filltree:32|9/9| |JDK|TestU01 (BigCrush)|1:SerialOver|9/9| |JDK|TestU01 (BigCrush)|3:CollisionOver|9/9| |JDK|TestU01 (BigCrush)|5:CollisionOver|9/9| |JDK|TestU01 (BigCrush)|7:CollisionOver|9/9| |JDK|TestU01 (BigCrush)|9:CollisionOver|9/9| |JDK|TestU01 (BigCrush)|11:CollisionOver|9/9| |JDK|TestU01 (BigCrush)|13:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|14:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|15:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|16:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|17:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|18:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|19:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|20:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|21:BirthdaySpacings|9/9| |JDK|TestU01 (BigCrush)|22:ClosePairs|9/9| |JDK|TestU01 (BigCrush)|23:ClosePairs|9/9| |JDK|TestU01 (BigCrush)|24:ClosePairs|9/9| |JDK|TestU01 (BigCrush)|25:ClosePairs|9/9| |JDK|TestU01 (BigCrush)|26:SimpPoker|9/9| |JDK|TestU01 (BigCrush)|28:SimpPoker|9/9| |JDK|TestU01 (BigCrush)|30:CouponCollector|9/9| |JDK|TestU01 (BigCrush)|31:CouponCollector|9/9| |JDK|TestU01 (BigCrush)|34:Gap|9/9| |JDK|TestU01 (BigCrush)|36:Gap|9/9| |JDK|TestU01 (BigCrush)|40:Permutation|9/9| |JDK|TestU01 (BigCrush)|41:Permutation|9/9| |JDK|TestU01 (BigCrush)|42:Permutation|9/9| |JDK|TestU01 (BigCrush)|43:Permutation|9/9| |JDK|TestU01 (BigCrush)|44:CollisionPermut|9/9| |JDK|TestU01 (BigCrush)|45:CollisionPermut|9/9| |JDK|TestU01 (BigCrush)|46:MaxOft|9/9| |JDK|TestU01 (BigCrush)|47:MaxOft|9/9| |JDK|TestU01 (BigCrush)|48:MaxOft|9/9| |JDK|TestU01 (BigCrush)|49:MaxOft|9/9| |JDK|TestU01 (BigCrush)|50:SampleProd|9/9| |JDK|TestU01 (BigCrush)|51:SampleProd|9/9| |JDK|TestU01 (BigCrush)|52:SampleProd|9/9| |JDK|TestU01 (BigCrush)|57:AppearanceSpacings|9/9| |JDK|TestU01 (BigCrush)|59:WeightDistrib|9/9| |JDK|TestU01 (BigCrush)|62:WeightDistrib|9/9| |JDK|TestU01 (BigCrush)|63:WeightDistrib|9/9| |JDK|TestU01 (BigCrush)|65:SumCollector|9/9| |JDK|TestU01 (BigCrush)|74:RandomWalk1|9/9| |JDK|TestU01 (BigCrush)|86:LongestHeadRun|9/9| |JDK|TestU01 (BigCrush)|90:HammingWeight2|9/9| |JDK|TestU01 (BigCrush)|95:HammingIndep|9/9| |JDK|TestU01 (BigCrush)|97:HammingIndep|9/9| |JDK|TestU01 (BigCrush)|99:HammingIndep|9/9| |JDK|TestU01 (BigCrush)|101:Run|9/9| |MT|TestU01 (BigCrush)|80:LinearComp|9/9| |MT|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_512_A|TestU01 (BigCrush)|68:MatrixRank|9/9| |WELL_512_A|TestU01 (BigCrush)|69:MatrixRank|9/9| |WELL_512_A|TestU01 (BigCrush)|70:MatrixRank|9/9| |WELL_512_A|TestU01 (BigCrush)|71:MatrixRank|9/9| |WELL_512_A|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_512_A|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_1024_A|TestU01 (BigCrush)|70:MatrixRank|9/9| |WELL_1024_A|TestU01 (BigCrush)|71:MatrixRank|9/9| |WELL_1024_A|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_1024_A|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_19937_A|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_19937_A|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_19937_C|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_19937_C|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_44497_A|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_44497_A|TestU01 (BigCrush)|81:LinearComp|9/9| |WELL_44497_B|TestU01 (BigCrush)|80:LinearComp|9/9| |WELL_44497_B|TestU01 (BigCrush)|81:LinearComp|9/9| So the JDK version is broken and the MT and Well variants fail LinearComplexity and small Well variants fail MatrixRank. Here's the results table across 9 runs ignoring dieharder_sums: ||RNG identifier||Dieharder||TestU01 (BigCrush)|| |JDK|14 12 14 14 16 15 11 12 13|78 75 76 74 72 75 75 76 75| |MT|0 0 0 0 0 0 0 0 0|2 2 2 3 2 2 4 2 2| |WELL_512_A|0 0 0 0 0 0 0 0 0|6 6 6 7 6 6 6 6 6| |WELL_1024_A|0 0 0 0 0 0 0 0 0|4 4 5 4 4 5 5 4 4| |WELL_19937_A|0 0 0 0 0 0 0 0 0|2 3 4 3 2 2 2 3 2| |WELL_19937_C|0 0 0 0 0 0 0 0 0|2 2 2 2 2 3 2 2 2| |WELL_44497_A|0 0 0 0 0 0 0 0 0|3 3 2 2 3 3 2 2 2| |WELL_44497_B|0 0 0 0 0 0 0 0 0|2 2 3 2 2 2 2 2 3| |ISAAC|0 0 1 0 0 0 0 0 0|1 0 0 0 1 0 1 0 0| |MT_64|0 0 0 0 0 1 0 0 0|3 4 3 4 2 3 3 2 3| |SPLIT_MIX_64|0 0 0 0 0 0 0 0 0|0 1 0 0 0 0 2 0 0| |XOR_SHIFT_1024_S|0 0 0 0 0 0 0 0 0|0 0 0 0 1 1 2 0 0| |TWO_CMRES|1 1 1 1 1 1 1 1 1|0 0 1 2 0 1 0 0 1| |MWC_256|0 0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0 0| |KISS|0 0 0 0 0 0 0 0 0|0 1 0 1 2 0 0 0 1| Here's the old {{LongProviders}} for reference: ||RNG identifier||Dieharder||TestU01 (BigCrush)|| |MT_64|0 0 0|2 2 3| |SPLIT_MIX_64|0 0 0|0 0 0| |XOR_SHIFT_1024_S|0 0 0|0 0 0| |TWO_CMRES|0 1 0|0 0 0| So of the {{LongProviders}} * SPLIT_MIX and XOR_SHIFT_1024_S fail a few BigCrush tests at random * MT_64 systematically fails 2 LinearComplexity tests and few others at random * TWO_CMRES systematically fails dieharder_dna and a few BigCrush at random I have checked for a lower threshold for systematic failure, e.g. >20% failure rate (2/9) on a given test, and the results are the same except for the JDK and MT_64. For MT_64 there are 2/9 failures for BigCrush 77:RandomWalk1. This means that the others failures that are present are due to random seeding as they never occur on the same test more than once. > CachedUniformRandomProvider for nextBoolean() and nextInt() > ----------------------------------------------------------- > > Key: RNG-57 > URL: https://issues.apache.org/jira/browse/RNG-57 > Project: Commons RNG > Issue Type: Improvement > Components: core > Reporter: Alex D Herbert > Priority: Minor > Labels: performance > Fix For: 1.2 > > > Implement a wrapper around a {{UniformRandomProvider}} that can cache the > underlying source of random bytes for use in the methods {{nextBoolean()}} > and {{nextInt()}} (in the case of {{LongProvider}}). E.g. > {code:java} > LongProvider provider = RandomSource.create(RandomSource.SPLIT_MIX_64); > CachedLongProvider rng = new CachedLongProvider(provider); > // Uses cached nextLong() 64 times > rng.nextBoolean(); > // Uses cached nextLong() twice > rng.nextInt(); > IntProvider provider = RandomSource.create(RandomSource.KISS); > CachedIntProvider rng2 = new CachedIntProvider(provider); > // Uses cached nextInt() 32 times > rng2.nextBoolean(); > // This could be wrapped by a factory method: > UniformRandomProvider rng = CachedUniformRandomProviderFactory.wrap( > // Any supported source: IntProvider or LongProvider > RandomSource.create(RandomSource...)); > {code} > The implementation should be speed tested to determine the benefit for > {{nextBoolean()}} and if {{nextInt()}} can be improved for {{LongProviders}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)