On Dec 12, 2011, at 7:11 AM, Sean Owen wrote:

> On Mon, Dec 12, 2011 at 11:59 AM, Grant Ingersoll <[email protected]> wrote:
>
>> In Lucene, we simply print out what the seed is if the tests fail and then
>> you can rerun that test by saying ant -Dtestseed=XXXX test
>
> I like that -- it's a separate thing but it's a fine idea too. It lets you
> at least try different seeds while not sacrificing repeatability.
>
>> AFAICT, the issue with some of these tests, LogLikelihoodTest (the
>> frequency comparison test fails when you change the seed) is that they are
>> testing specific values as outcomes based on a specific test seed. That's
>> OK, but it doesn't lend itself to the reset problem and it makes things
>> problematic to debug should we ever change the seed, etc.
>
> Yes, that's right, and so it only works if it sees the same random number
> sequence. Even if it didn't, we'd still want repeatability, though ideally
> it would work with all random seeds so that we can use different seeds.
>
> Assertions like this are good for detecting changing behavior, but not
> strictly asserting something that must not be false.
>
> The easiest solution is to just remove assertions that seem to be testing
> the random number sequence; a slightly better idea would be to relax them
> to assert things that must be true (e.g. count must be positive and less
> than the number of items, or something).

+1. We do seem to be testing the RNG in some of these tests a bit more than
we are testing the actual thing.
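For something like the frequency-comparison case, the relaxation could be as
simple as this -- just a sketch, the class and method names below are made up:

import static org.junit.Assert.assertTrue;

import java.util.Random;

import org.junit.Test;

/** Hypothetical test, not the real LogLikelihoodTest -- names are made up. */
public class FrequencyCheckTest {

  @Test
  public void testCountIsSaneForAnySeed() {
    Random random = new Random(123L);   // in practice the seed comes from the test framework
    int numItems = 1000;
    int count = countMatches(random, numItems);

    // Brittle: only holds for one particular seed / RNG sequence:
    //   assertEquals(487, count);

    // Relaxed: must hold no matter what the seed is.
    assertTrue("count should be positive", count > 0);
    assertTrue("count should not exceed the number of items", count <= numItems);
  }

  /** Stand-in for whatever the real test actually exercises. */
  private static int countMatches(Random random, int numItems) {
    int count = 0;
    for (int i = 0; i < numItems; i++) {
      if (random.nextDouble() < 0.5) {
        count++;
      }
    }
    return count;
  }
}

The exact assertion obviously depends on the test, but anything along those
lines holds for every seed.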
> Better still would be to come up with assertions that ought to be true with
> very high probability, such that a failure is almost certainly a symptom of
> a bug. It's hard to write those bounds; you could guess conservatively and
> then relax as you observe failures that don't appear to be a bug. This is a
> fair bit of work though!

I'm not sure if it is completely valid, but it seems to me that if our tests
can't run concurrently, it also raises a doubt as to whether some of our
classes can be run concurrently.

>> This is where Ant seems to be a lot better.
>
> Right, it would be fairly easy to fork JVMs in Ant. I imagine there's some
> way to do it in Maven, but how I don't know.
>
> Hmm, is there any easy way to have it run tests in all of the modules,
> independently and in parallel, in n JVMs? That would do it too.
>
> Or, say we had a separate test src dir for "big" tests -- can that
> secondary test target be told to run in parallel?
>
> I suppose you could write an Ant target to do the work and then invoke it
> in Maven.

I think if we adopt the Lucene test framework (it's a separate JAR), we can
get both the -Dtestseed thing and annotations such as @slow, @nightly,
@weekly, etc. There are probably other useful nuggets in there too. It should
work with Maven.
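Whichever way we go, the core of the -Dtestseed idea is small. Something along
these lines is all it really takes (a sketch only -- the class and property
names here are made up, and the Lucene JAR does this properly, plus a lot
more):

import java.util.Random;

import org.junit.Rule;
import org.junit.rules.TestWatcher;
import org.junit.runner.Description;

/**
 * Sketch only: pick a random seed per test unless one is forced via
 * -Dtestseed=..., and print the seed when a test fails so the failure
 * can be reproduced.
 */
public abstract class RandomizedTestCase {

  private long seed;
  protected Random random;

  @Rule
  public final TestWatcher seedReporter = new TestWatcher() {
    @Override
    protected void starting(Description description) {
      String forced = System.getProperty("testseed");
      seed = forced != null ? Long.parseLong(forced) : System.nanoTime();
      random = new Random(seed);
    }

    @Override
    protected void failed(Throwable e, Description description) {
      System.err.println(description + " failed with seed " + seed
          + "; rerun with -Dtestseed=" + seed);
    }
  };
}

Subclasses would pull all their randomness from random, and a failing test
prints the seed to feed back in with -Dtestseed.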

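Going back to Sean's point about assertions that ought to be true with very
high probability: a deliberately conservative bound might look something like
this (again just a sketch with made-up names; the real bounds would have to be
worked out per test):

import static org.junit.Assert.assertTrue;

import java.util.Random;

import org.junit.Test;

/** Hypothetical example of a deliberately conservative probabilistic bound. */
public class ConservativeBoundTest {

  @Test
  public void testHeadsCountIsPlausibleForAnySeed() {
    Random random = new Random();   // deliberately not fixing the seed
    int n = 100000;
    int heads = 0;
    for (int i = 0; i < n; i++) {
      if (random.nextBoolean()) {
        heads++;
      }
    }
    // heads ~ Binomial(n, 0.5): mean n/2, std dev sqrt(n)/2 (about 158 here).
    // Allowing 10 standard deviations is absurdly conservative, so a failure
    // is almost certainly a bug rather than bad luck with the seed.
    double mean = n / 2.0;
    double tolerance = 10.0 * Math.sqrt(n) / 2.0;
    assertTrue("heads count " + heads + " is implausibly far from " + mean,
        Math.abs(heads - mean) <= tolerance);
  }
}

The factor of ten is intentional overkill; per Sean's suggestion, a bound like
that could start conservative and be relaxed only if it ever flags something
that turns out not to be a bug.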