Teemu Ikonen <tpiko...@gmail.com> writes:

> Does anyone here have good ideas on how to ensure reproducibility in
> the long term?

Regression testing, as mentioned, or running some fixed analysis and
statistically comparing the results to past runs.

We worry about reproducibility in my field of particle physics. We run
on many different Linux and Mac platforms and strive for statistical
consistency (see below), not bit-identical results. I don't recall
there ever being an issue with different versions of, say, Debian
system libraries. Any inconsistencies we have found have been due to
version skew between different copies of our own codes.

[Aside: I have seen gross differences between Debian and RH-derived
platforms. In a past experiment I was the only collaborator working on
Debian and almost everyone else was using Scientific Linux (a RHEL
derivative). I kept getting bitten by our code crashing on me. It
seems that, for some reason, my builds tended to leave garbage in
uninitialized pointers, whereas on SL they tended to come out NULL. So
I was the lucky one who got to find and fix a lot of programming
mistakes. This could have just been a fluke; I have no explanation for
it.]

> The only thing that comes to my mind is to run all
> important calculations in a virtual machine image which is then signed
> and stored in case the results need verification. But, maybe there are
> other options?

We have found that running the exact same code on the same Debian OS
on different CPUs will lead to different results. They differ because
the IEEE FP "standard" isn't implemented exactly the same on all CPUs.
The results differ only in the least significant digits. But if your
simulation consumes random numbers and compares them against FP
values, this can lead to much larger divergences. However, with a
large enough sample the results are all statistically consistent. I
don't know how that translates to virtual machines on different host
CPUs, but if you care about bit-for-bit identical results, this FP
"standard" issue may percolate up through the VM and ruin that.
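
A minimal sketch of what such a statistical comparison could look
like, in Python. Everything in it is hypothetical: the names
acceptance_fraction and statistically_consistent, the seeds, and the
3-sigma cut are illustrative choices, not taken from any real analysis
code. Reordering a sum stands in for a platform difference in the last
digits, and the two seeds stand in for random-number streams that have
diverged between runs.

```python
import math
import random

N_EVENTS = 100_000


def acceptance_fraction(threshold, n_events, seed):
    """Toy Monte Carlo: fraction of uniform random numbers below threshold."""
    rng = random.Random(seed)
    accepted = sum(1 for _ in range(n_events) if rng.random() < threshold)
    return accepted / n_events


def statistically_consistent(f_a, f_b, n_events, n_sigma=3.0):
    """True if two acceptance fractions agree within n_sigma binomial errors."""
    p = 0.5 * (f_a + f_b)  # pooled estimate of the acceptance probability
    sigma = math.sqrt(2.0 * p * (1.0 - p) / n_events)  # error on the difference
    return abs(f_a - f_b) <= n_sigma * sigma


# The same sum in two orders: a stand-in (assumption) for two platforms
# that disagree in the least significant digits.
t_a = (0.1 + 0.2) + 0.3
t_b = 0.1 + (0.2 + 0.3)
print(t_a == t_b)  # False: the two thresholds are not bit-identical

# Different seeds stand in (assumption) for random streams that diverged.
f_a = acceptance_fraction(t_a, N_EVENTS, seed=1)
f_b = acceptance_fraction(t_b, N_EVENTS, seed=2)
print(f_a, f_b, statistically_consistent(f_a, f_b, N_EVENTS))
```

The two fractions will not match digit for digit, but with a sample
this size they should normally agree within the binomial error, which
is the kind of consistency a regression test can check for.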

Anyway, in the end, all CPUs give the "wrong" results, since FP
calculations are not infinitely precise, so striving for bit-for-bit
consistency is kind of a pipe dream.

-Brett.