Using forkEvery=1, my distributedTest run took 3h5m which is maybe 10 minutes slower than the random sampling I did of other tests and distributedTestCore took 2h35m which is on par with others.
I'd say that it's worth switching to forkEvery=1 as it will most likely provide more consistency for test results. --Jens On Thu, Dec 14, 2017 at 4:29 PM, Patrick Rhomberg <prhomb...@pivotal.io> wrote: > Are the AcceptanceTests currently set to fork after every one? It might be > apples to oranges, but we could try to look at the cost difference there. > > On Thu, Dec 14, 2017 at 4:24 PM, Kirk Lund <kl...@apache.org> wrote: > > > Thanks Jens! Let us know if you end up running dunit in parallel with > > forkevery 1 or if it's completely serial (one at a time). I think > parallel > > with docker with forkevery 1 is going to give us the best results but I'm > > sure you already know that since you're the docker king. > > > > This change would eliminate memory pollution. Any test pollution > remaining > > after this should be caused by dunit writing to non-TemporaryFolder > > directories such as current working directory or user's home directory. > > > > On Thu, Dec 14, 2017 at 4:08 PM, Jens Deppe <jde...@pivotal.io> wrote: > > > > > I'll do that. > > > > > > On Thu, Dec 14, 2017 at 2:02 PM, Kirk Lund <kl...@apache.org> wrote: > > > > > > > Someone needs to try it out. If there are any build-engineer types > > > working > > > > on geode, then my suggestion is for them to try this change and > report > > > the > > > > timing difference. > > > > > > > > On Tue, Dec 12, 2017 at 2:22 PM, Alexander Murmann < > > amurm...@pivotal.io> > > > > wrote: > > > > > > > > > Do we have a rough idea how forking every time would impact how > long > > > > tests > > > > > run? > > > > > > > > > > On Tue, Dec 12, 2017 at 1:39 PM, Kirk Lund <kl...@apache.org> > wrote: > > > > > > > > > > > We should just change to fork every 1 instead of 30. Wasting time > > > > trying > > > > > to > > > > > > debug statics is well... it's a waste of time. We should be > focused > > > on > > > > > > other things. > > > > > > > > > > > > On Mon, Dec 11, 2017 at 9:05 PM, Jinmei Liao <jil...@pivotal.io> > > > > wrote: > > > > > > > > > > > > > It doesn't call as much static methods as > > > JUnit4DistributedTestCase. > > > > > > > tearDownVM, > > > > > > > see MemberStarterRule.after(). > > > > > > > > > > > > > > On Mon, Dec 11, 2017 at 4:36 PM, Dan Smith <dsm...@pivotal.io> > > > > wrote: > > > > > > > > > > > > > > > I don't think we are trying to reuse the distributed system > - > > it > > > > > gets > > > > > > > > disconnected after each test. See JUnit4DistributedTestCase. > > > > > > tearDownVM. > > > > > > > > > > > > > > > > Are the new junit rules also cleaning things up? > > > > > > > > > > > > > > > > -Dan > > > > > > > > > > > > > > > > On Mon, Dec 11, 2017 at 4:16 PM, Kirk Lund <kl...@apache.org > > > > > > wrote: > > > > > > > > > > > > > > > > > Is there a reason we can't change DistributedTestCase and > > > > > subclasses > > > > > > to > > > > > > > > use > > > > > > > > > TemporaryFolder for all artifacts? > > > > > > > > > > > > > > > > > > We could also disconnectAllFromDS in @AfterClass (or even > > > @After) > > > > > to > > > > > > > get > > > > > > > > > things a bit more separate between dunit test classes. > > > > > > > > > > > > > > > > > > Running dunit tests in parallel is much more important than > > > > trying > > > > > to > > > > > > > > reuse > > > > > > > > > distributed system across multiple dunit tests. The latter > > just > > > > > isn't > > > > > > > > worth > > > > > > > > > the headache and trouble that it causes when static vars or > > > > > constants > > > > > > > or > > > > > > > > > disk artifacts pollute later tests. > > > > > > > > > > > > > > > > > > On Mon, Dec 11, 2017 at 1:42 PM, Dan Smith < > > dsm...@pivotal.io> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > One other thing you can do is look for the below line in > > the > > > > logs > > > > > > of > > > > > > > > your > > > > > > > > > > failure. These are the tests that ran in the same JVM as > > your > > > > > > tests. > > > > > > > > This > > > > > > > > > > won't help if your tests are getting messed up by disk > > > > artifacts > > > > > or > > > > > > > > port > > > > > > > > > > issues, but if it is some JVM state left by a previous > test > > > it > > > > > > would > > > > > > > be > > > > > > > > > in > > > > > > > > > > this list. > > > > > > > > > > > > > > > > > > > > Previously run tests: [ClientServerMiscSelectorDUnitT > est, > > > > > > > > > > ClientConflationDUnitTest, ReliableMessagingDUnitTest] > > > > > > > > > > > > > > > > > > > > On Mon, Dec 11, 2017 at 1:14 PM, Jens Deppe < > > > > > jensde...@apache.org> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > I've recently debugged various distributed tests which > > fail > > > > as > > > > > a > > > > > > > > result > > > > > > > > > > of > > > > > > > > > > > prior tests not cleaning up enough. It's somewhat > painful > > > and > > > > > > this > > > > > > > is > > > > > > > > > my > > > > > > > > > > > usual debug process: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Examine the progress.txt file to determine which > > tests > > > > ran > > > > > > > > before > > > > > > > > > my > > > > > > > > > > > failing test. > > > > > > > > > > > - I pick 20-25 of these tests and create a Suite > > > > (including > > > > > my > > > > > > > > > failing > > > > > > > > > > > test) - as these tests may have run in parallel, > it's > > > not > > > > > > clear > > > > > > > > > which > > > > > > > > > > > tests > > > > > > > > > > > would have run immediately prior to your test > > > > > > > > > > > - Run the whole suite to see if I can get my test to > > > fail > > > > > > > > > > > - Bisect or manually iterate through the tests to > see > > > > which > > > > > > one > > > > > > > is > > > > > > > > > > > causing the problem. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The last step is painful, so I've updated SuiteRunner > to > > > use > > > > a > > > > > > > > > > 'candidate' > > > > > > > > > > > test class and run this class after every other class > in > > > the > > > > > list > > > > > > > of > > > > > > > > > > > SuiteClasses. For example: > > > > > > > > > > > > > > > > > > > > > > @Suite.SuiteClasses(value = { > > > > > > > > > > > org.apache.geode.cache.snapshot. > > > > > SnapshotByteArrayDUnitTest. > > > > > > > > class, > > > > > > > > > > > org.apache.geode.cache.query.dunit. > > > > > > > > QueryDataInconsistencyDUnitTes > > > > > > > > > > > t.class, > > > > > > > > > > > org.apache.geode.cache.query.internal.index. > > > > > > > > > > > MultiIndexCreationDUnitTest.class, > > > > > > > > > > > }) > > > > > > > > > > > @SuiteRunner.Candidate(org.apache.geode.management. > > > > > > > > > > > internal.configuration.ClusterConfigDistributionDUnit > > > > > Test.class) > > > > > > > > > > > @RunWith(SuiteRunner.class) > > > > > > > > > > > public class DebugSuite { > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The Candidate is optional, but this would run the > > following > > > > > > tests: > > > > > > > > > > > > > > > > > > > > > > - SnapshotByteArrayDUnitTest > > > > > > > > > > > - *ClusterConfigDistributionDUnitTest* > > > > > > > > > > > - QueryDataInconsistencyDUnitTest > > > > > > > > > > > - *ClusterConfigDistributionDUnitTest* > > > > > > > > > > > - MultiIndexCreationDUnitTest > > > > > > > > > > > - *ClusterConfigDistributionDUnitTest* > > > > > > > > > > > > > > > > > > > > > > --Jens > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Cheers > > > > > > > > > > > > > > Jinmei > > > > > > > > > > > > > > > > > > > > > > > > > > > >