Tidy, yes.  But better, no.

The Lucene project has made an art out of randomizing configurations for
tests.  Thus, the many thousands of people out there doing tests will all
be testing different combinations of things and when a failure happens,
that seed can be codified into the standard tests.
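The pattern might look roughly like this sketch (the property name and the
randomized knob are made up for illustration, not actual Lucene machinery):

```java
import java.util.Random;

public class RandomizedConfigExample {
  // Derive a randomized configuration knob from the run's RNG.
  static int bufferSize(Random rng) {
    return 1 << (4 + rng.nextInt(8));  // a power of two from 16 to 2048
  }

  public static void main(String[] args) {
    // Use a fresh seed per run unless one was handed in to reproduce
    // a failure; "test.seed" is an illustrative property name.
    long seed = Long.getLong("test.seed", new Random().nextLong());
    Random rng = new Random(seed);

    int bufferSize = bufferSize(rng);

    // Print the seed so a failing combination can be codified into a
    // fixed regression test later.
    System.out.println("test.seed=" + seed + " bufferSize=" + bufferSize);
  }
}
```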

The situation is a bit different for some of the randomized tests in Mahout.
For these, there is generally only a (weak) statistical guarantee about the
result; for instance, the test might be expected to succeed 99.9% of the
time.  To avoid spurious worries, once a test has been qualified to fail no
more often than expected, the seed is frozen so that things sit still.  Most
of the errors that we are after will trigger a hard failure, so we lose
little power this way and still have stability.

A good example of this is a random number generator that is supposed to
sample from a particular distribution.  If you draw 10,000 deviates from
this generator, you can write a very simple test based on the cumulative
distribution function.  Simple, that is, except that putting sharp bounds
on the test causes a non-negligible probability of failure even for a
working version of the software.  On the other hand, loose bounds increase
the likelihood that the test will still pass after somebody breaks the
code.  Increasing the number of samples makes the useful bounds much
tighter and lowers the probability of false success for bad code, but it
increases the test time.

There is little way around this Heisen-situation.  So we freeze the tests.
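A rough sketch of such a frozen test, assuming a uniform generator and a
Kolmogorov-Smirnov-style statistic (the class name, the seed, and the 0.05
bound are all illustrative choices, not Mahout code):

```java
import java.util.Arrays;
import java.util.Random;

public class CdfSeedTest {
  // Maximum deviation between the empirical CDF and the uniform CDF.
  static double ksStatistic(double[] samples) {
    double[] x = samples.clone();
    Arrays.sort(x);
    int n = x.length;
    double d = 0;
    for (int i = 0; i < n; i++) {
      // The empirical CDF steps from i/n to (i+1)/n at x[i];
      // the uniform CDF at x[i] is just x[i].
      d = Math.max(d, Math.max(Math.abs((i + 1.0) / n - x[i]),
                               Math.abs(x[i] - (double) i / n)));
    }
    return d;
  }

  public static void main(String[] args) {
    // Frozen seed: once the test has been qualified to fail no more
    // often than intended, pinning the seed keeps it from flapping.
    Random rng = new Random(42);
    int n = 10000;
    double[] samples = new double[n];
    for (int i = 0; i < n; i++) {
      samples[i] = rng.nextDouble();
    }
    double d = ksStatistic(samples);
    // For n = 10000 the 99.9% KS critical value is about
    // 1.95 / sqrt(n), roughly 0.02.  A looser bound such as 0.05
    // almost never fails spuriously but misses subtler bugs; more
    // samples would let the bound tighten without raising the
    // false-alarm rate, at the cost of test time.
    System.out.println(d < 0.05);
  }
}
```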

There are other types of tests where randomization doesn't change the
guarantees that the code makes at all.  This often occurs in tinker-toy
software where you can plug together all kinds of components
interchangeably.  That is really different from the random number
distribution problem.
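In that plug-together style, a randomized pick among interchangeable
implementations can be as simple as this sketch (standard List
implementations stand in for whatever components the test wires up):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;
import java.util.function.Supplier;

public class ComponentShuffleExample {
  // Both implementations honor the same List contract, so a test that
  // randomly picks one makes no weaker a guarantee than a fixed choice.
  static final List<Supplier<List<Integer>>> FACTORIES =
      List.of(ArrayList::new, LinkedList::new);

  static List<Integer> randomList(Random rng) {
    return FACTORIES.get(rng.nextInt(FACTORIES.size())).get();
  }

  public static void main(String[] args) {
    List<Integer> list = randomList(new Random());
    list.add(1);
    list.add(2);
    // The contract holds regardless of which implementation was drawn.
    System.out.println(list.size());
  }
}
```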

On Tue, Sep 4, 2012 at 12:26 AM, Sean Owen <[email protected]> wrote:

> I think this approach is even tidier than just recording the RNG seed
> for later reuse.
>
