As a point of reference, if I comment out the reset() code in useTestSeed for the math package, all tests pass w/ parallel execution and fork once. Of course, that's just one piece.
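To make the suspected interaction concrete, here is a minimal sketch of it. This is not the actual Mahout code: TrackedRandom, TEST_SEED and draw() are made-up names, and the seed value is arbitrary. It only assumes the general pattern, namely that every wrapper registers itself in a static list and a static useTestSeed() resets all of them. The "other test" is simulated by a call midway through the loop rather than by a real second thread, so the effect is reproducible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified stand-in for the pattern under discussion; not the Mahout source.
public class SharedSeedSketch {

    static final long TEST_SEED = 42L; // assumed value, for illustration only

    /** A random source that registers itself in a static list so it can be reset later. */
    static final class TrackedRandom {
        private static final List<TrackedRandom> INSTANCES = new ArrayList<>();
        private Random delegate = new Random(TEST_SEED);

        TrackedRandom() {
            synchronized (INSTANCES) {
                INSTANCES.add(this);
            }
        }

        synchronized int nextInt(int bound) {
            return delegate.nextInt(bound);
        }

        synchronized void reset() {
            delegate = new Random(TEST_SEED);
        }

        /** Mimics a static useTestSeed(): resets every instance ever created. */
        static void useTestSeed() {
            synchronized (INSTANCES) {
                for (TrackedRandom r : INSTANCES) {
                    r.reset();
                }
            }
        }
    }

    /** Draws n values; optionally simulates another test's setup running midway. */
    static int[] draw(int n, boolean resetMidway) {
        TrackedRandom rnd = new TrackedRandom();
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            if (resetMidway && i == n / 2) {
                // Stands in for a different test's @Before firing in the same JVM.
                TrackedRandom.useTestSeed();
            }
            out[i] = rnd.nextInt(1000);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(draw(6, false)));
        System.out.println(Arrays.toString(draw(6, true)));
    }
}
```

In the second array the last three values repeat the first three, because the midway reset rewound the generator underneath the running "test". With real parallel execution the point at which that happens would vary from run to run.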
I guess I don't understand why we need to do all that reset stuff there anyway. If you are using the test seed, it gets marked in the @Before (and probably should be @BeforeClass), and then that test should have all of its RandomWrappers set to the test seed going forward. Why do we need to track all the other randoms that were ever created and reset them?

This seems to have been introduced on 2/19/10 by:

<snip>
Add several more unit tests for cf.taste. Make random numbers all but guaranteed to be deterministic during unit tests, to allow for repeatable tests of components with randomness.

git-svn-id: https://svn.apache.org/repos/asf/lucene/mahout/trunk@911810 13f79535-47bb-0310-9956-ffa450edef68
</snip>

-Grant

On Dec 11, 2011, at 6:08 AM, Grant Ingersoll wrote:

> In working through what I _think_ will be the primary viable way to make this
> stuff faster (parallel execution, fork once) it appears to me that the
> primary concurrency issue is due to how we initialize the test seed and the
> fact that we loop over all RandomWrapper objects and reset them. So, it's
> likely the case that in mid stream of some of the tests, the RNG is getting
> reset by other calls to the static useTestSeed() method.
>
> Of course, there might be other concurrency issues beyond that, but this
> seems like the most likely one to start with. Thus, the question is how to
> fix it. The obvious one, I suppose, is to not use statics for this stuff.
> Another is, perhaps, to use a system property (-DuseTestSeed=true and/or
> -DuseSeed=<SEED>, the latter being useful for debugging other things) that is
> set upon invocation in the test plugin, but that has the downside that it
> would also need to be set when running from an IDE.
>
> And, to Sean's point below, it seems that we may have some test dependencies
> on the specific set of random numbers and the outcomes they produce.
>
> Thoughts? Other ideas?
>
> On Dec 8, 2011, at 1:05 PM, Grant Ingersoll wrote:
>
>> Progress! I had configured the surefire plugin in the wrong place
>>
>> On Dec 8, 2011, at 2:55 PM, Sean Owen wrote:
>>
>>> This could well be it. While every Random everywhere gets initialized to a
>>> known initial state, at the start of every @Test method, you could get
>>> different sequences if other tests are in progress in parallel in the same
>>> JVM.
>>>
>>> Ideally tests aren't that sensitive to the sequence of random numbers -- if
>>> that's the case. And here it may well be the case.
>>>
>>> Can this be set to fork a JVM per test class? That would probably work.
>>>
>>> On Thu, Dec 8, 2011 at 7:43 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>
>>>> On Dec 8, 2011, at 2:39 PM, Grant Ingersoll wrote:
>>>>
>>>>> On Dec 8, 2011, at 2:23 PM, Grant Ingersoll wrote:
>>>>>
>>>>>> If I add parallel, fork always to the main surefire config, I get
>>>>>> failures all over the place for things like:
>>>>>> Failed tests:
>>>>>> testHebbianSolver(org.apache.mahout.math.decomposer.hebbian.TestHebbianSolver):
>>>>>> Error: {0.06146049974880152 too high! (for eigen 3)
>>>>>> consistency(org.apache.mahout.math.jet.random.NormalTest):
>>>>>> offset=0.000 scale=1.000 Z = 8.2
>>>>>> consistency(org.apache.mahout.math.jet.random.ExponentialTest):
>>>>>> offset=0.000 scale=100.000 Z = 8.7
>>>>>
>>>>> Check that, it seems each run can produce different failures, which
>>>>> leads me to believe we have some shared values in our tests
>>>>
>>>> Random.getRandom() the culprit, perhaps?
>>>>
>>>>>> All of these pass individually and when not in parallel for me.
>>>>>>
>>>>>> Here's my config:
>>>>>> <plugin>
>>>>>>   <groupId>org.apache.maven.plugins</groupId>
>>>>>>   <artifactId>maven-surefire-plugin</artifactId>
>>>>>>   <version>2.11</version>
>>>>>>   <configuration>
>>>>>>     <parallel>classes</parallel>
>>>>>>     <forkMode>always</forkMode>
>>>>>>     <perCoreThreadCount>true</perCoreThreadCount>
>>>>>>   </configuration>
>>>>>> </plugin>
>>>>>>
>>>>>> Anyone else seeing that?
>>>>>>
>>>>>> On Dec 8, 2011, at 1:53 PM, Dmitriy Lyubimov wrote:
>>>>>>
>>>>>>> SSVD actually runs a rather small test but it is a MR job in local
>>>>>>> mode, there's nothing to cut down there in terms of size (not much
>>>>>>> anyway). It's just what it takes to initialize and run all jobs (and
>>>>>>> since it is local, it is also single threaded, so it actually runs V
>>>>>>> and U jobs sequentially instead of parallel so it's even longer
>>>>>>> because of that (4 jobs stringed all in all)).
>>>>>>>
>>>>>>> But I will take a look, although even if I reduce solution size, it
>>>>>>> will still likely not reduce running time by more than 20%.
>>>>>>>
>>>>>>> On Thu, Dec 8, 2011 at 5:42 AM, David Murgatroyd <dmu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Dec 8, 2011, at 8:36 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> MAHOUT-916 and 917 are attempts to address the running time of our
>>>>>>>>> tests. As Sean rightfully pointed out, there are probably
>>>>>>>>> opportunities to simply cut down the sizes of some of these tests
>>>>>>>>> without affecting their correctness. To that end, if people can
>>>>>>>>> take a look at:
>>>>>>>>> https://builds.apache.org/job/Mahout-Quality/1237/testReport/junit/
>>>>>>>>>
>>>>>>>>> You can get a sense as to which tests are taking a long time. The
>>>>>>>>> main culprits are:
>>>>>>>>> 1. Vectorizer
>>>>>>>>> 2. SSVD
>>>>>>>>> 3. K-Means
>>>>>>>>> 4. taste.hadoop.item
>>>>>>>>> 5. taste.hadoop.als
>>>>>>>>> 6. PFPGrowth
>>>>>>>>>
>>>>>>>>> -Grant
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> Grant Ingersoll
>>>>>>>>> http://www.lucidimagination.com

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com