[ https://issues.apache.org/jira/browse/LUCENE-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125001#comment-13125001 ]
Dawid Weiss commented on LUCENE-3492:
-------------------------------------

A word of warning: this will be a longer comment. I still hope somebody will read it ;)

I've written a somewhat largish chunk of code that provides an infrastructure to run "randomized", but "repeatable" tests. I'd like to report on my impressions so far.

Robert was right that a custom runner provides more flexibility than a @Rule on top of the default JUnit runner (which changes depending on where you run it -- ant, maven, Eclipse, etc.). I've spent a lot of time inspecting the current implementation inside JUnit and I came to the conclusion that it really is best to have a full reimplementation of the Runner interface. Full meaning not descending from ParentRunner, but implementing the whole runner from scratch. This provides additional, uhm, unexpected benefits in that one can add new functionality that "regular" JUnit runners don't have and _still_ be compatible with hosting environments such as Ant, Maven or Eclipse (because they, thank God, respect @RunWith).

Among the things I have implemented so far that are missing or different in JUnit are:

- There is a "context" object which is accessible via a thread local, so @BeforeClass and other suite-level hooks can actually access the suite class, inspect it, check conditions, whatever (the runner's random seed is also passed via this context). This is useful, but not crucial.

- I've decided to deviate from JUnit's strict policy of having public hook methods. By default this only causes headaches when one shadows or overrides a hook in the parent class and it is no longer invoked. A better (different) idea is to declare hooks as private; no shadowing occurs and they will all get invoked in a contractual predefined order (befores - super to class, afters - class to super).

- I've added additional suite-level annotations. @Listeners provides listeners automatically hooked to RunListener.
@Validators hooks up additional validators for verifying extra restrictions. An example of such a restriction is bailing out of the test suite if shadowed or overridden methods exist in the class hierarchy of a suite class. Another (that I have implemented) is a validator checking for non-annotated testXXX methods that are dead JUnit3 test cases. You get the idea. A lot of code then simply vanishes from LTC; I can envision it having this shape:

{code}
@Listeners({
  StandardErrorInfoRunListener.class})
@Validators({
  NoHookMethodShadowing.class,
  NoTestMethodOverrides.class,
  NoJUnit3TestMethods.class})
public abstract class LuceneTestCase extends RandomizedTest {
  ...
}
{code}

Some of these things are currently verified using a state machine (calling super() in overridden methods), but it just looks better to me to move this concern elsewhere rather than implement it inside LTC.

- The entire lifecycle of handling test method calls and hooks is controlled in the runner. I made a design decision to _not_ follow JUnit's insane wrap-wrap-wrap-exception style but instead report all exceptions that happen anywhere in the lifecycle. So if you get an exception in the test case, followed by an exception in @After, followed by an exception in @AfterClass, all these exceptions will be reported separately to the RunListener and in effect to all listening objects (in the lifecycle-corresponding order!). Such an implementation does work fine with Ant JUnit reports, Maven reports and in Eclipse (all exceptions are included) so far as I can tell -- I didn't check other environments like NetBeans or IntelliJ. Again: in my personal opinion this is a much clearer way of dealing with exceptions in the lifecycle of a JUnit test case compared to wrapping them into artificial exceptions (MultipleFailureException being a supreme example) or suppressing them altogether.
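To make the "report everything separately" idea concrete, here is a minimal sketch in plain Java (no JUnit on the classpath). FailureListener is a hypothetical stand-in for RunListener, and Step, runPhase and runLifecycle are names invented for this illustration; the real runner's lifecycle handling is more involved than this.

```java
import java.util.ArrayList;
import java.util.List;

final class SeparateFailureReporting {
    /** Hypothetical stand-in for the failure callback of a RunListener. */
    interface FailureListener {
        void failure(String phase, Throwable t);
    }

    /** One lifecycle step (test body or hook); it may throw anything. */
    interface Step {
        void run() throws Throwable;
    }

    /** Runs one step and reports its failure immediately instead of wrapping it. */
    static void runPhase(String phase, Step step, FailureListener listener) {
        try {
            step.run();
        } catch (Throwable t) {
            listener.failure(phase, t);
        }
    }

    /** Runs test, @After and @AfterClass in order; every failure is reported on its own. */
    static List<String> runLifecycle(Step test, Step after, Step afterClass) {
        List<String> reported = new ArrayList<>();
        FailureListener listener = (phase, t) -> reported.add(phase + ": " + t.getMessage());
        runPhase("test", test, listener);
        runPhase("@After", after, listener);
        runPhase("@AfterClass", afterClass, listener);
        return reported;
    }

    public static void main(String[] args) {
        List<String> reported = runLifecycle(
            () -> { throw new IllegalStateException("test failed"); },
            () -> { throw new IllegalStateException("cleanup failed"); },
            () -> { throw new IllegalStateException("suite cleanup failed"); });
        // Each failure arrives as its own report, in lifecycle order.
        reported.forEach(System.out::println);
    }
}
```

A later hook still runs after an earlier step has failed, and the listener sees three independent failures rather than one wrapper exception.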
- I couldn't resist a tiny tweak of making any exceptions thrown from hooks or test methods carry the information about the seed used in their execution (both runner-level and method-level, even though the latter could be derived from the former). There is no easy way to do it because Throwables are designed not to allow changes to their content once constructed. With the exception of stack traces :) So I simply inject debugging info inside the stack trace as an artificial entry; what it looks like is here, for instance:

{noformat}
java.lang.Error: Blah blah exception message.
    at __randomizedtesting.SeedInfo.seed([60BDF6E574486C2:60BDF6E76C930BC]:0)
    at […].examples.TestStackAugmentation$Nested.testMethod1(TestStackAugmentation.java:29)
{noformat}

(Note how the seed info is inside the file position of the StackTraceElement object.) This may seem like an overly clever solution, but I've had it many times that sysouts got discarded or lost somehow and an exception object along with the stack trace is always there in front of your eyes. Another way to capture-and-dump reproduction info is to use the @Listeners annotation above; this can be used for much of what LTC does today -- -D…, -D…, -D...

- A custom runner can have custom implementations of the contractual "events", such as assumptions or ignore triggers. This takes away a lot of code related to trying to get around JUnit's API limitations (assume without message/cause, method filtering and dynamic ignores based on extra conditions like @Nightly, etc.).

In short: I'm really happy with a custom Runner.

As for the infrastructure for writing randomized test cases:

- There is currently one "master" seed that the runner either generates randomly or accepts as a global constant. Everything else: method shuffling, initial random instance for each test case (method repetition)… really everything is based on sequential calls to this generator.
This has advantages and disadvantages I guess (read about static initializers below), but it was my personal desire to implement it this way and based on my few days' worth of experience with this code, it works great.

- I've written a base class RandomizedTest that extends Assert and has a number of utility methods for picking random numbers or objects from collections. There is no passing of explicit _Random_ instances around like it is done currently in LTC though. The base class accesses the context's Random (which it is assigned by the runner) and then uses this random consistently to generate pseudo-randomness in selection of attributes and iterations. Of course once you go multi-threaded this will all go to dust, but I imagine multi-threaded tests shouldn't use the base class's randomness (a test case based on race conditions won't be repeatable anyway). If anything, generate per-thread Randoms based on the current seed and let each thread handle its own sequence of pseudo-random numbers from there. This is even possible at runtime with non-mock objects as I'm going to show in Barcelona, hopefully.

Now… if you're still with me you're probably interested in how this applies to Lucene. The wall I've hit is the sheer amount of code that any change to LTC affects. I realized it'd be large, but it's just gargantuan :) The major issue is with static initializers and static public methods called from them that leave resources behind. I'm sorry, but nobody can convince me this isn't evil. I understand certain things are costly and require a one-time setup, but these should really be moved to @BeforeClass fixture hooks. If one really needs to do things once at JVM lifespan level, a @BeforeClass with some logic to perform a single initialization can be a replacement for a static initializer (even if it's unclear to me when exactly such a fixture would be really needed).
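A minimal sketch of such a replacement, written in plain Java so it runs without JUnit: setUpOnce() is a hypothetical hook that a real suite would annotate with @BeforeClass, and the guarded block stands in for whatever costly setup currently lives in a static initializer. The class name, the counter and the guard are assumptions made for illustration, not code from the runner.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class OncePerJvmFixture {
    // Flips to true the first time any suite reaches the hook.
    private static final AtomicBoolean initialized = new AtomicBoolean();

    // Counts how often the costly part really ran (for demonstration only).
    static final AtomicInteger expensiveSetups = new AtomicInteger();

    // Hypothetical hook; in a JUnit suite this would carry @BeforeClass,
    // so the runner, not class loading, decides when it executes.
    static void setUpOnce() {
        if (initialized.compareAndSet(false, true)) {
            expensiveSetups.incrementAndGet(); // stand-in for the costly one-time work
        }
    }

    public static void main(String[] args) {
        // Three suites in the same JVM each invoke their class-level hook.
        for (int suite = 0; suite < 3; suite++) {
            setUpOnce();
        }
        System.out.println("expensive setups: " + expensiveSetups.get());
    }
}
```

The point of the design is that the work happens inside a hook the runner invokes, so failures and leftover resources are visible to the lifecycle. Note that compareAndSet only guarantees the work runs once; concurrent callers would not wait for it to finish, so a suite running hooks in parallel would need a stronger guard.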
In short: the problem with static initializers is that they are executed outside the lifecycle control of the runner… I'd say most of the problems and current patchy solutions inside LTC (dealing with resource tracking for example) are somehow related to the fact that static initializers and static method calls are used throughout the codebase.

I am currently wondering if it's feasible to provide a single patch that will make a drop-in replacement of LTC. It may be that adding another skeleton class based on the "new" infrastructure and rewriting tests one by one to use it is a more sensible way to go.

The runner (alone) is currently at github if you care to take a look. I think Barcelona may be a good place to talk about this face to face and decide what to do with it. I myself am leaning towards having parallel base classes and porting existing tests in chunks.

> Extract a generic framework for running randomized tests.
> ---------------------------------------------------------
>
>                 Key: LUCENE-3492
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3492
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: general/test
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: Screen Shot 2011-10-06 at 12.58.02 PM.png
>
>
> I love the idea of randomized testing. Everyone (we at CarrotSearch, Lucene
> and Solr folks) have their glue to make it possible. The question is if
> there's something to pull out that others could share without having the need
> to import Lucene-specific classes.
> The work on this issue is on my github account (lots of experiments):
> https://github.com/dweiss/randomizedtesting
> Or directly: git clone git://github.com/dweiss/randomizedtesting.git

--
This message is automatically generated by JIRA.