On Mon, 18 Mar 2019 at 16:49, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
> On 18/03/2019 14:12, Gilles Sadowski wrote:
> > Hi.
> >
> >>>> [...]
> >>>>
> >>>> One actual issue is that we are testing long providers using the long to
> >>>> create 2 int values. Should we test using a series of the upper 32 bits
> >>>> and then a series of the lower 32 bits?
> >>> Is that useful since the test now sees the integers as they are produced
> >>> (i.e. 2 values per long)?
> >>>
> >> It is not relevant if you are concerned about int quality. But if you are
> >> concerned about long quality then it is relevant. The long quality is
> >> important for the quality of nextDouble(). Although in that case only the
> >> upper 53 bits of the long. This means that the quality of a long from an
> >> int provider is also not covered by the benchmark as that would require
> >> testing alternating ints twice using the series: 1, 3, 5…, 2n+1 and 2, 4,
> >> 6, … 2n.
> > I don't follow: I'd think that if the full sequence passes the test,
> > then "decimated" sequences will too.
>
> My position was that if a series of int values is random, that does not
> mean a subset of the int values is random due to bias in the subset sample.

Doesn't this statement contradict...

>
> However I acknowledge that:
>
> - the test suites may have this covered already
>
> - if it really is random then any subset will also be random, even if it
> is a systematic subset such as alternating values

... this one?

>
> >> Given that half of the int values were previously discarded from the
> >> BigCrush analysis, the current results on the user guide page actually
> >> represent BigCrush running on the upper 32-bits of the long, byte reversed
> >> due to the big/little endian interpretation of the bytes in Java and Linux.
> >>
> >> So maybe an update to the RandomStressTester to support analysis for
> >> int or long quality is needed.
> > I'm not convinced.
>
> I'm not totally convinced either. It is a lot more work to test upper
> and lower bits separately.
>
> It may be that a producer of long values has better randomness in the
> upper bits. Or put another way has less than 64-bits of randomness.
>
> The question is whether running the test suite on all the bits (as we
> currently do) or targeting just the upper or lower 32-bits is useful.
> E.g. would a RNG that fails a few tests using all the bits pass with
> just the upper 32-bits,

Since the RNG outputs all the bits, passing the test with part of them
would have no particular value (except if the goal is to create a new
implementation based on that observation).

> and fail more with just the lower 32-bits, or
> would the fails be the same?
>
> Note: The current results for long providers do not test the lower
> 32-bits at all, and currently test alternating values from any int
> providers. So they will have to be rerun anyway.

+1
[For the sake of consistency, but I don't think that the results will
be different.]

> Previously I looked at systematic failures in the test suite (where the
> same test always fails). IIRC the MersenneTwister has some systematic
> failures.
> Since we are not doing systematic failure analysis for the
> user guide, and we are not developing the algorithms, then I agree that
> a more detailed analysis of the failures and their origins is beyond the
> scope of the quality section.

We already provide a wider choice of good implementations than any
other Java library; indeed, I think that time is better spent in porting
more algorithms (even "bad" ones).

>
> So leave the testing to just ints and document on the user guide that is
> what we are testing.

+1

Gilles

>
> >> For now the quality section on the website should just state that the
> >> quality is for the ‘nextInt()’ method of the RNG.
> >>
> >> I have the results of BigCrush using the new bridge c program:
> >>
> >> XorShiftSerialComposite : 40, 39, 39 : 608.2 +/- 3.9
> > Makes sense now. :-}
> >
> >> So it fails.
> >>
> >> The XorShiftXorComposite crashed after 2 hours about 1/4 of the results
> >> file complete. I am running again so I can monitor it for memory usage.
> >> Something in the BigCrush suite just cannot handle this generator output.
> > Strange...
>
> Yep. I restarted it and it crashed after 3 hours again! Monitoring every
> minute found no obvious memory issues. The BigCrush process never
> exceeded 2.7% of memory and Java never exceeded 0.1%.
>
> The footer is written by the Java program so this indicates that the
> TestU01 bridge is stopping then the Java process writes the footer,
> wraps everything up and stops.
>
> Weirdly my process to follow the output also stopped which is
> unexpected. I am investigating if my system has some strange walltime
> limits I do not know about. Since the other composite generators work
> using the same code I am thinking it may be a bug in TestU01 when the
> generator is bad (which DieHarder thinks it definitely is). But I am
> prepared to be wrong on that and also to never find out.
>
> I've changed RandomStressTester to redirect the stderr to stdout (in
> case that contains any info) and added a line to get the exit code from
> the Java Process that is running BigCrush. Maybe that will be non-zero.
> So I re-run and wait 3+ hours again...
>
> Alex
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
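For readers following the upper/lower 32-bit discussion in this thread, here is a minimal sketch of the bit manipulations involved. It is only an illustration: the class name LongBitsDemo and its method names are hypothetical and are not part of Commons RNG or RandomStressTester, and the double conversion shown is one common way to use the upper 53 bits, not necessarily the one the library uses.

```java
/** Hypothetical helper, for illustration only (not Commons RNG code). */
final class LongBitsDemo {
    private LongBitsDemo() {}

    /** Upper 32 bits of a long, as an int. */
    static int upper32(long v) {
        return (int) (v >>> 32);
    }

    /** Lower 32 bits of a long, as an int. */
    static int lower32(long v) {
        return (int) v;
    }

    /**
     * Maps the upper 53 bits of a long to a double in [0, 1).
     * Assumption: a common conversion, shown only to make the
     * "upper 53 bits" remark concrete.
     */
    static double toUnitDouble(long v) {
        return (v >>> 11) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        long v = 0x0123456789ABCDEFL;
        System.out.printf("upper=%08x lower=%08x%n", upper32(v), lower32(v));
        // Byte order as seen by a reader with the opposite endianness.
        System.out.printf("upper, bytes reversed=%08x%n",
                          Integer.reverseBytes(upper32(v)));
        System.out.println(toUnitDouble(v));
    }
}
```

Feeding a test suite only upper32(...) or only lower32(...) of each long would give the two separate series discussed above; feeding both in turn gives the "2 values per long" series.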