Those of you interested in reproducibility might be interested in VisTrails. There is a start-up commercializing the software, but most of it is free and development is open source, available from http://www.vistrails.org/index.php/Downloads. As I recall, the software keeps track of the libraries, OS, and CPU that the code used to produce its results.
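I don't remember the exact mechanism VisTrails uses, but the kind of environment record it keeps amounts to something like the following. This is a minimal sketch in plain Python, not the VisTrails API, and the package list is just an example:

    import json
    import platform
    import sys
    from importlib import metadata

    def environment_record(packages=("numpy", "scipy")):
        """Snapshot the platform and library versions a result was
        produced with, so a run can later be matched to its environment."""
        versions = {}
        for pkg in packages:
            try:
                versions[pkg] = metadata.version(pkg)
            except metadata.PackageNotFoundError:
                versions[pkg] = None   # not installed
        return {
            "os": platform.platform(),
            "cpu": platform.processor() or platform.machine(),
            "python": sys.version,
            "packages": versions,
        }

    # Store this JSON next to the results it describes.
    print(json.dumps(environment_record(), indent=2))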
Best,
António

António Rafael C. Paiva
Post-doctoral fellow
SCI Institute, University of Utah
Salt Lake City, UT

On Fri, Apr 30, 2010 at 8:51 AM, Brett Viren <b...@bnl.gov> wrote:
> Teemu Ikonen <tpiko...@gmail.com> writes:
>
>> Does anyone here have good ideas on how to ensure reproducibility in
>> the long term?
>
> Regression testing, as mentioned, or running some fixed analysis and
> statistically comparing the results to past runs.
>
> We worry about reproducibility in my field of particle physics. We run
> on many different Linux and Mac platforms and strive for statistical
> consistency (see below), not identical consistency. I don't recall there
> ever being an issue with different versions of, say, Debian system
> libraries. Any inconsistencies we have found have been due to version
> skew between different copies of our own code.
>
> [Aside: I have seen gross differences between Debian and RH-derived
> platforms. In a past experiment I was the only collaborator working on
> Debian; almost everyone else was using Scientific Linux (an RHEL
> derivative). I kept getting bitten by our code crashing on me. For some
> reason, my compilations tended to put garbage in uninitialized pointers,
> whereas on SL they tended to get NULL. So I was the lucky one who got to
> find and fix a lot of programming mistakes. This could have just been a
> fluke; I have no explanation for it.]
>
>> The only thing that comes to my mind is to run all
>> important calculations in a virtual machine image which is then signed
>> and stored in case the results need verification. But, maybe there are
>> other options?
>
> We have found that running the exact same code and the same Debian OS on
> differing CPUs will lead to differing results. They differ because the
> IEEE FP "standard" isn't implemented exactly the same on all CPUs. The
> results differ only in the least significant digits, but if you use
> simulations that consume random numbers and compare them against FP
> values, this can lead to grosser divergences. However, with a large
> enough sample the results are all statistically consistent.
>
> I don't know how that translates to virtual machines running on
> different host CPUs, but if you care about bit-for-bit identity, this
> FP "standard" issue may percolate up through the VM and ruin that.
> Anyway, in the end, all CPUs give "wrong" results, since FP calculations
> are not infinitely precise, so striving for bit-for-bit consistency is
> kind of a pipe dream.
>
> -Brett.
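P.S. To illustrate Brett's two points with a toy example (a hedged sketch, not any experiment's actual regression framework; the distribution, sample sizes, and 3-sigma threshold are made up):

    import math
    import random

    # Same math, different association: the last bits differ, which is
    # the same kind of difference CPU-level FP variation produces.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False

    def mean_and_error(xs):
        """Sample mean and its standard error."""
        n = len(xs)
        m = sum(xs) / n
        var = sum((x - m) ** 2 for x in xs) / (n - 1)
        return m, math.sqrt(var / n)

    def statistically_consistent(reference, current, n_sigma=3.0):
        """True if the two samples' means agree within n_sigma combined
        standard errors -- statistical consistency rather than
        bit-for-bit equality."""
        m1, e1 = mean_and_error(reference)
        m2, e2 = mean_and_error(current)
        return abs(m1 - m2) <= n_sigma * math.hypot(e1, e2)

    # Stand-in for the same fixed analysis run on two platforms:
    # different random streams, same underlying distribution.
    rng_a, rng_b = random.Random(1), random.Random(2)
    run_a = [rng_a.gauss(0.0, 1.0) for _ in range(10_000)]
    run_b = [rng_b.gauss(0.0, 1.0) for _ in range(10_000)]
    print(statistically_consistent(run_a, run_b))   # expect True

The idea is that the reference sample is frozen once and stored, and each new run is compared against it statistically instead of byte-for-byte.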