> On 16 Mar 2019, at 02:54, Gilles Sadowski <gillese...@gmail.com> wrote:
>> This is read by dieharder which directly reads from stdin. This worked to 
>> collect all the generated bits and the serial and xor composites failed the 
>> test suite.
>> 
>> It is also read by the stdin2testu01.c program to pass to TestU01.
>> 
>> What is happening is that the stdin2testu01.c is reading 64-bits using an 
>> unsigned long.
> 
> I don't remember why I wrote that, but as you pointed out it now looks
> like a plain bug.

It may be more complicated again...

I’ve had a play around with the data being pushed through to the TestU01 
library using the C bridge. I wanted to check that the int value that is 
generated by the RNG is passed through to the C program. So I wrote a simple 
BridgeTester class to do this. It writes all the int values to a data file (for 
reference) then passes them to the C executable with the same method as the 
RandomStressTester. I then modified the stdin2testu01.c program to have an 
extra hidden debug mode where all the data is just written to stdout.
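
To illustrate the idea of the debug mode, here is a minimal sketch. It is not 
the code in stdin2testu01.c: the name dump_uint32 and the one-decimal-value-
per-line output format are assumptions made for this example only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of stdin2testu01.c): copy every complete
 * 32-bit value from `in` to `out` as one unsigned decimal number per line.
 * Values are interpreted in the host's native byte order.
 * Returns the number of values written.
 */
size_t dump_uint32(FILE *in, FILE *out)
{
    uint32_t value;
    size_t count = 0;

    /* fread returns the number of complete items read, so a trailing
     * partial value (fewer than 4 bytes) is silently dropped. */
    while (fread(&value, sizeof value, 1, in) == 1) {
        fprintf(out, "%" PRIu32 "\n", value);
        count++;
    }
    return count;
}
```

Calling dump_uint32(stdin, stdout) instead of running a test battery would 
give a text view of exactly what the C side received.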

I found the data file written from Java did not match the data that the C 
program had. A bit more digging found that the problem was that Java uses a big 
endian representation and the C program is little endian. This is true on my 
Linux and Mac OSX platforms. So the raw bytes read from stdin are in the wrong 
order.

When I updated the program to detect its own endianness and swap the byte order 
of each set of 4 bytes read from stdin, the data in the C program matched the 
original.
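
For reference, the detect-and-swap approach can be sketched roughly as below. 
This is an illustration only, not the code committed to master; the function 
names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int is_little_endian(void)
{
    const uint32_t probe = 1;
    /* Look at the first byte in memory: it holds the 1 only when the
     * least significant byte is stored first. */
    return *(const unsigned char *) &probe == 1;
}

/* Reverses the byte order of a single 32-bit value. */
uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/*
 * Converts `n` values that were written big endian (as Java does) into
 * host order, in place. On a big-endian host nothing needs to change.
 */
void big_endian_to_host(uint32_t *buf, size_t n)
{
    if (!is_little_endian()) {
        return;
    }
    for (size_t i = 0; i < n; i++) {
        buf[i] = swap_bytes32(buf[i]);
    }
}
```

The swap is applied per 4-byte value after reading from stdin, so the buffer 
ends up holding the same ints that the Java side generated.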

Since it was non-destructive to the module I added all this to master. You can 
see this working by rebuilding the C bridge and running the new profile to test 
it:

> cd commons-rng-examples/examples-stress
> gcc src/main/c/stdin2testu01.c -o stdin2testu01 -ltestu01 -ltestu01probdist -ltestu01mylib -lm
> mvn test -P bridge

You should see two files:

target/bridge.data
target/bridge.out

These should have the same contents. The .data file is written by the Java 
program, and the .out file is the stdout captured from the C program with its 
view of the data.

This should fix running TestU01.

BUT I’ve not had time to determine how Dieharder is reading the stdin. Given it 
is a C library it may be reading it using little endian as well. I’ll look into 
that next.

Composite update:

For some reason all my BigCrush simulations crashed. It could be a RAM issue. 
The runs did take longer than expected but I did not monitor memory usage. I’ve 
started them again but using only the serial composite. I think the xor one is 
really broken.

FYI. Using the new bridge code with 3 runs of SmallCrush finds [6, 6, 6] / 15 
failed tests for the serial composite and [9, 9, 10] / 15 for the xor 
composite.

I’m expecting BigCrush to fail a lot. I’m now more interested in seeing if it 
will complete.

Alex



