On 21/03/2019 15:13, Gilles Sadowski wrote:
[...]
If the user forgets to supply one, the program outputs one, and stops;
then the user reissues the command?
Yes:

   > java -jar examples-stress.jar -h

Print something helpful

   > java -jar examples-stress.jar --template

Print a template generators list to stdout

   > java -jar examples-stress.jar --template > list.txt

   > java -jar examples-stress.jar target/tu_ 4 list.txt BE
./stdin2testu01 BigCrush


I've used picocli before. It definitely needs very little extra code due
to the use of annotations.

One thing I do not know is what happens to the arguments for the stress
test program, e.g.

/usr/bin/dieharder -a -g 200 -Y 1 -k 2

If they match anything used by the examples-stress.jar program then they
will be consumed by a parser. If options match arguments to be passed to
the stress test program then the executable program would have to be put
into a script. For now we can choose the arguments to not clash. Should
be simple given we avoid these:

./stdin2testu01 BigCrush

/usr/bin/dieharder -a -g 200 -Y 1 -k 2

So:

-h, --help => help

--template => print a template

I would leave these as mandatory as they are all important to not forget:

    * output file prefix
    * int threads
    * generators list
    * endianness (an enum of BE or LE)
    * application
    * application arguments

For picocli that would be:

@Parameters(index = "0")    File prefix;
@Parameters(index = "1")    int threadCount;
@Parameters(index = "2")    File generatorsList;
@Parameters(index = "3")    Endianness endianness;
@Parameters(index = "4")    File executable;
@Parameters(index = "5..*") String[] executableArguments;


So it is very simple. I will make modifications to the updated program
to use Picocli.
I'd suggest to change the program usage to make it more flexible, i.e.

$ java -jar RandomStressTester.jar --prefix prefix --threads 4 --tasks
genlist --byteorder BE -- /usr/bin/dieharder -a -g 200 -Y 1 -k 2

Thus, everything that follows the double-dash is the command-line
for the "ProcessBuilder".
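
A minimal sketch of that split, using only the JDK and no option-parsing library. The class and method names (DoubleDashSplit, ownOptions, subProcessCommand) are hypothetical and not part of the stress test program; the sketch only builds the ProcessBuilder and does not start the process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits the arguments at the first "--".
class DoubleDashSplit {
    /** Arguments before the first "--" (all of them if there is no "--"). */
    public static List<String> ownOptions(String[] args) {
        List<String> all = Arrays.asList(args);
        int i = all.indexOf("--");
        return new ArrayList<>(all.subList(0, i < 0 ? all.size() : i));
    }

    /** Arguments after the first "--" (empty if there is no "--"). */
    public static List<String> subProcessCommand(String[] args) {
        List<String> all = Arrays.asList(args);
        int i = all.indexOf("--");
        if (i < 0) {
            return new ArrayList<>();
        }
        return new ArrayList<>(all.subList(i + 1, all.size()));
    }

    public static void main(String[] args) {
        String[] example = {
            "--threads", "4", "--byteorder", "BE", "--",
            "/usr/bin/dieharder", "-a", "-g", "200", "-Y", "1", "-k", "2"
        };
        // Everything after "--" goes to the sub-process untouched.
        ProcessBuilder builder = new ProcessBuilder(subProcessCommand(example));
        System.out.println("Java program options: " + ownOptions(example));
        System.out.println("Sub-process command:  " + builder.command());
    }
}
```

With this layout an option such as -a or -k can never be mistaken for an option of the Java program, because nothing after the double-dash is parsed.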

And there can be default values for
   * prefix (possibly aborting if the targets already exist)
   * threads
   * tasks (a file provided in the JAR (?))
   * byteorder ("LE")

So, this could work too:

$ java -jar RandomStressTester.jar -- /usr/bin/dieharder -a -g 200 -Y 1 -k 2
OK. So we provide defaults for everything. I've just found this snippet
to determine the endianness:

    import java.nio.ByteOrder;

    if (ByteOrder.nativeOrder().equals(ByteOrder.BIG_ENDIAN)) {
        System.out.println("Big-endian");
    } else {
        System.out.println("Little-endian");
    }

The JDK will throw an Error from the call to nativeOrder() if the
platform byte order is not a recognised form, e.g. on a weird
mixed/middle-endian platform.

Picocli even supports setting the ByteOrder value using "BIG_ENDIAN" or
"LITTLE_ENDIAN".


Why the '--'? Seems moot if using a library to parse the arguments.
https://picocli.info/#_double_dash_code_code

Point is: Why parse options if the whole bunch is passed to some other
class that will perform its own processing?

That is nice.

I missed that in the picocli manual.


Without it the bare command line arguments are all used for the test suite.
In the end, it does not matter but it delineates what is used
by the Java program and what is used by the "sub-process".

For reference here are the results of BigCrush with:

The correct little-endian byte order:

XorShiftXorComposite : 54, 53, 53 : 646.8 +/- 10.9
XorShiftSerialComposite : 40, 39, 39 : 608.2 +/- 3.9
SplitXorComposite : 0, 0, 0 : 625.8 +/- 0.2

The incorrect big-endian byte order:

XorShiftXorComposite : 92, 89, 90 : 986.7 +/- 4.3
XorShiftSerialComposite : 75, 74, 76 : 632.0 +/- 2.3

(I did not run the control.)

This makes a fair bit of difference as it did for dieharder. So the byte
order is important to get correct. I.e. you are not testing the true
output of the generator if the bytes are reversed.
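
To make the byte-order point concrete, here is a small sketch (not the stress test code; the class name ByteOrderDemo and the sample value are made up for illustration) showing that the same 32-bit output becomes a different byte stream depending on the order used to write it. A reader that assumes the other order reconstructs a different number.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo: one int written in both byte orders.
class ByteOrderDemo {
    /** Serialises a single int using the given byte order. */
    public static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    /** Reads a single int back using the given byte order. */
    public static int fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x01020304; // arbitrary sample value
        byte[] be = toBytes(value, ByteOrder.BIG_ENDIAN);
        byte[] le = toBytes(value, ByteOrder.LITTLE_ENDIAN);
        System.out.println("Big-endian bytes:    " + Arrays.toString(be));
        System.out.println("Little-endian bytes: " + Arrays.toString(le));
        // Writing big-endian but reading little-endian reverses the bytes.
        int misread = fromBytes(be, ByteOrder.LITTLE_ENDIAN);
        System.out.printf("Written %08x, misread as %08x%n", value, misread);
        System.out.println("Native order here:   " + ByteOrder.nativeOrder());
    }
}
```
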
I wonder whether this is the consequence of correlations.
IOW: Would order matter for a good generator?
I don't know. But when the benchmark gets rerun a drop in the number of
failures would give evidence that the byte order is important. This may
be best observed with the JDK generator as it has a long way to fall.
For a bad generator, you've shown that it is important, but since the
current table has entries with 0 failures, it means that it isn't for a
good generator.  Or am I missing something?

I agree with you.

If each bit is totally random and unrelated to any other bit then reversing the bytes (or even bits) will make no difference to the randomness.

So the effect of endianness is due to the correlated sequences.

OK. This piqued my interest a bit, so for example these systematically fail when the bytes are reversed:

sknuth_Run: "measures the lengths of subsequences of successive values in [...] increasing (or decreasing) order"

sknuth_Permutation uses ordering.

sknuth_MaxOft uses the maximum number.


svaria_SampleProd uses products of nonoverlapping successive groups of t values.

(and products use magnitude)


sstring_HammingCorr: Applies a correlation test on the Hamming weights of successive blocks of L bits


So reversing the bytes, which affects the interpretation of the number and the bit order, makes tests fail that use the order or magnitude of bits.


Without wanting to read too much into this, it may be because the XorShift1024Star algorithm has a weakness in the lowest few bits. This is what the XorShift1024Star -> XorShift1024StarPhi change is meant to correct, i.e. make one of the lowest bits more random.

When the bytes are reversed these poor bits become the most significant byte. This makes the magnitude of the 32-bit number less random than when the poor bits are least significant.
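
A deliberately exaggerated sketch of this effect, assuming a made-up generator whose lowest 8 bits are always zero (this is NOT how XorShift1024Star behaves; its weakness is far subtler). The class name ReversedMagnitude and the sample values are hypothetical. After byte reversal the weak byte is the most significant one, so every value is confined to the bottom 1/256 of the unsigned 32-bit range.

```java
// Hypothetical demo: a weak low byte dominates the magnitude after reversal.
class ReversedMagnitude {
    /** Unsigned value of the int after its four bytes are reversed. */
    public static long reversedUnsigned(int value) {
        return Integer.toUnsignedLong(Integer.reverseBytes(value));
    }

    public static void main(String[] args) {
        // Made-up outputs: the upper 24 bits vary, the lowest 8 bits are stuck at zero.
        int[] samples = {0x9E377900, 0xFFFFFF00, 0x12345600};
        for (int s : samples) {
            System.out.printf("%08x -> %08x (unsigned %d of at most %d)%n",
                s, Integer.reverseBytes(s), reversedUnsigned(s), (1L << 32) - 1);
        }
    }
}
```

Tests that depend on ordering or magnitude (runs, permutations, maxima, products) see the most significant byte first, which is consistent with the list of failing tests above.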

This is not detected for the XorShift1024Star on its own. But when the sequence is correlated by using the combined generator this change causes tests to fail.


Alex



[But, for sure: better report the correct number of failures for the bad
generators too.]

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

