[jira] [Commented] (LUCENE-3492) Extract a generic framework for running randomized tests.

Dawid Weiss (Commented) (JIRA) Tue, 11 Oct 2011 06:07:41 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125001#comment-13125001
 ]


Dawid Weiss commented on LUCENE-3492:
-------------------------------------

A word of warning: this will be a longer comment. I still hope somebody will 
read it ;)

I've written a somewhat largish chunk of code that provides an infrastructure 
to run "randomized", but "repeatable" tests. I'd like to report on my 
impressions so far.

Robert was right that a custom runner provides more flexibility than a @Rule on 
top of the default JUnit runner (which changes depending on where you run it -- 
Ant, Maven, Eclipse, etc.). I've spent a lot of time inspecting the current 
implementation inside JUnit and I came to the conclusion that it really is best 
to have a full reimplementation of the Runner interface. "Full" meaning not 
extending ParentRunner, but implementing the whole runner from scratch. This 
provides additional, uhm, unexpected benefits in that one can add new 
functionality that "regular" JUnit runners don't have and _still_ be compatible 
with hosting environments such as Ant, Maven or Eclipse (because they, thank 
God, respect @RunWith). 

Among the things I have implemented so far that are missing or different in 
JUnit are:
- There is a "context" object which is accessible via thread local, so 
@BeforeClass and other suite-level hooks can actually access the suite class, 
inspect it, check conditions, whatever (the runner's random seed is also passed 
via this context). This is useful, but not crucial.
- I've decided to deviate from JUnit's strict policy of having public hook 
methods. By default this only causes headaches when one shadows or overrides a 
hook in the parent class and it is no longer invoked. A better (different) idea 
is to declare hooks as private; no shadowing occurs and they will all get 
invoked in a contractual predefined order (befores - super to class, afters - 
class to super).
- I've added additional suite-level annotations. @Listeners provides listeners 
automatically hooked to RunListener. @Validators hooks up additional validators 
for verifying extra restrictions. An example of such a restriction is bailing 
out the test suite if shadowed or overridden methods exist in the class 
hierarchy of a suite class. Another (that I have implemented) is a validator 
checking for non-annotated testXXX methods that are dead JUnit3 test cases. You 
get the idea. A lot of code then simply vanishes from LTC; I can envision it 
having this shape:
{code}
@Listeners({
  StandardErrorInfoRunListener.class})
@Validators({
  NoHookMethodShadowing.class,
  NoTestMethodOverrides.class,
  NoJUnit3TestMethods.class})
public abstract class LuceneTestCase extends RandomizedTest {
  ...
}
{code}
Some of these things are currently verified using a state machine (calling 
super() in overridden methods), but to me it just looks better to move this 
concern elsewhere rather than implement it inside LTC.
- The entire lifecycle of handling test method calls and hooks is controlled in 
the runner. I made a design decision to _not_ follow JUnit's insane 
wrap-wrap-wrap-exception style but instead report all exceptions that happen 
anywhere in the lifecycle. So if you get an exception in the test case, 
followed by an exception in @After, followed by an exception in @AfterClass, 
all these exceptions will be reported separately to the RunListener and in 
effect to all listening objects (in the lifecycle-corresponding order!). Such 
an implementation does work fine with Ant JUnit reports, Maven reports and 
in Eclipse (all exceptions are included) so far as I can tell -- I didn't check 
other environments like NetBeans or IntelliJ. Again: in my personal opinion 
this is a much clearer way of dealing with exceptions in the lifecycle of a 
JUnit test case compared to wrapping them into artificial exceptions 
(MultipleException being a supreme example) or suppressing them altogether.
- I couldn't resist a tiny tweak of making any exceptions thrown from hooks or 
test methods carry the information about the seed used in their execution (both 
runner-level and method-level, even though the latter could be derived from the 
former). There is no easy way to do it because Throwables are designed not to 
allow changes to their content once constructed. With the exception of stack 
traces :) So I simply inject debugging info inside the stack trace as an 
artificial entry; what it looks like is here, for instance:
{noformat}
java.lang.Error: Blah blah exception message.
        at __randomizedtesting.SeedInfo.seed([60BDF6E574486C2:60BDF6E76C930BC]:0)
        at […].examples.TestStackAugmentation$Nested.testMethod1(TestStackAugmentation.java:29)
{noformat}
(Note how the seed info is inside the file position of the StackTraceElement 
object.) This may seem like an overly clever solution, but it has happened to 
me many times that sysouts got discarded or lost somehow, whereas an exception 
object along with the stack trace is always there in front of your eyes. 
Another way to capture-and-dump reproduction info is to use the @Listeners 
annotation above; this can be used for much of what LTC does today -- -D…, -D…, 
-D...
- A custom runner can have a custom implementation of the contractual "events", 
such as assumptions or ignore triggers. This takes away a lot of code related 
to trying to get around JUnit's API limitations (assume without message/cause, 
method filtering and dynamic ignores based on extra conditions like @Nightly, 
etc.).
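
The private-hook ordering from the list above (befores run super to class) can 
be sketched with plain reflection. This is only an illustration under stated 
assumptions: PrivateHooks, @BeforeHook, Base and Derived are hypothetical names 
invented here, not the runner's actual API, and the real runner may collect and 
order hooks differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from the actual runner.
final class PrivateHooks {
  /** Stand-in for a "before" hook annotation (assumption, not JUnit's @Before). */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface BeforeHook {}

  /** Collects annotated methods declared in the hierarchy, superclass first. */
  static List<Method> collect(Class<?> suite) {
    List<Class<?>> hierarchy = new ArrayList<>();
    for (Class<?> c = suite; c != null && c != Object.class; c = c.getSuperclass()) {
      hierarchy.add(c);
    }
    Collections.reverse(hierarchy); // super to class; leave unreversed for "afters"
    List<Method> hooks = new ArrayList<>();
    for (Class<?> c : hierarchy) {
      // getDeclaredMethods() also returns private methods, so nothing is shadowed.
      for (Method m : c.getDeclaredMethods()) {
        if (m.isAnnotationPresent(BeforeHook.class)) {
          m.setAccessible(true);
          hooks.add(m);
        }
      }
    }
    return hooks;
  }

  /** Invokes every collected hook on the given instance, in list order. */
  static void invokeAll(Object instance, List<Method> hooks) {
    for (Method m : hooks) {
      try {
        m.invoke(instance);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Hook failed: " + m, e);
      }
    }
  }

  // Two private hooks with the same name; neither overrides or hides the other.
  static class Base {
    final List<String> log = new ArrayList<>();
    @BeforeHook private void setUp() { log.add("Base.setUp"); }
  }

  static class Derived extends Base {
    @BeforeHook private void setUp() { log.add("Derived.setUp"); }
  }
}
```

Because both setUp() methods are private, each one is found on its own declaring 
class and invoked exactly once; with public hooks the subclass method would 
override the parent's and the parent hook would silently stop running.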

In short: I'm really happy with a custom Runner.
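
For reference, the stack trace trick mentioned in the list above can be 
reproduced with the standard Throwable API alone. The following is a minimal 
sketch, not the code from the runner: SeedStackAugmenter and augment() are 
names made up for this example, and formatting the seeds as upper-case hex is 
an assumption based on the trace shown earlier.

```java
import java.util.Locale;

// Hypothetical helper; the class and method names are illustrative only.
final class SeedStackAugmenter {
  // Pseudo class name for the artificial frame, mirroring the trace shown above.
  static final String MARKER_CLASS = "__randomizedtesting.SeedInfo";

  private SeedStackAugmenter() {}

  /** Prepends an artificial frame carrying both seeds to t's stack trace. */
  static <T extends Throwable> T augment(T t, long runnerSeed, long methodSeed) {
    String seeds = "["
        + Long.toHexString(runnerSeed).toUpperCase(Locale.ROOT)
        + ":"
        + Long.toHexString(methodSeed).toUpperCase(Locale.ROOT)
        + "]";
    StackTraceElement[] original = t.getStackTrace();
    StackTraceElement[] augmented = new StackTraceElement[original.length + 1];
    // The seeds travel in the "file name" slot; the line number is a dummy 0.
    augmented[0] = new StackTraceElement(MARKER_CLASS, "seed", seeds, 0);
    System.arraycopy(original, 0, augmented, 1, original.length);
    t.setStackTrace(augmented);
    return t;
  }
}
```

Calling SeedStackAugmenter.augment(new Error("..."), runnerSeed, methodSeed) 
before reporting the failure puts the seeds on the first "at" line of whatever 
prints the trace, so they survive even when standard output is discarded.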

As for the infrastructure for writing randomized test cases:
- There is currently one "master" seed that the runner either generates 
randomly or accepts as a global constant. Everything else: method shuffling, 
initial random instance for each test case (method repetition)… really 
everything is based on sequential calls to this generator. This has advantages 
and disadvantages I guess (read about static initializers below), but it was my 
personal desire to implement it this way and based on my few days' worth of 
experience with this code, it works great.
- I've written a base class RandomizedTest that extends Assert and has a number 
of utility methods for picking random numbers or objects from collections. 
There is no passing of explicit _Random_ instances around like it is done 
currently in LTC though. The base class accesses the context's Random (which it 
is assigned by the runner) and then uses this random consistently to generate 
pseudo-randomness in selection of attributes and iterations. Of course once you 
go multi-threaded this will all go to dust, but I imagine multi-threaded tests 
shouldn't use the base class's randomness (a test case based on race conditions 
won't be repeatable anyway). If anything, generate per-thread Randoms based on 
current seed and let each thread handle its own sequence of pseudo-random 
numbers from there. This is even possible at runtime with non-mock objects as 
I'm going to show in Barcelona, hopefully.
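
The seed handling described in the two points above could look roughly like 
this. It is a sketch under assumptions: SeedDerivation and its methods are 
hypothetical names, and java.util.Random stands in for whatever generator the 
runner really uses.

```java
import java.util.Random;

// Hypothetical sketch of deriving everything from one master seed.
final class SeedDerivation {
  private SeedDerivation() {}

  /** Derives one seed per test method by sequential calls to the master generator. */
  static long[] methodSeeds(long masterSeed, int methodCount) {
    Random master = new Random(masterSeed);
    long[] seeds = new long[methodCount];
    for (int i = 0; i < methodCount; i++) {
      seeds[i] = master.nextLong();
    }
    return seeds;
  }

  /** Gives each thread its own Random, all derived from the current seed. */
  static Random[] perThreadRandoms(long currentSeed, int threadCount) {
    Random parent = new Random(currentSeed);
    Random[] randoms = new Random[threadCount];
    for (int i = 0; i < threadCount; i++) {
      // Each thread then consumes only its own sequence from here on.
      randoms[i] = new Random(parent.nextLong());
    }
    return randoms;
  }
}
```

Given the same master seed, methodSeeds() returns the same array on every run, 
which is what makes a failing method repeatable from the runner-level seed 
alone. The per-thread variant keeps each thread's sequence repeatable, although 
the interleaving between threads of course is not.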


Now… if you're still with me you're probably interested in how this applies to 
Lucene. The wall I've hit is the sheer amount of code that any change to LTC 
affects. I realized it'd be large, but it's just gargantuan :) 

The major issue is with static initializers and static public methods called 
from them that leave resources behind. I'm sorry, but nobody can convince me 
this isn't evil. I understand certain things are costly and require a one-time 
setup, but these should really be moved to @BeforeClass fixture hooks. If one 
really needs to do things once at JVM lifespan level a @BeforeClass with some 
logic to perform a single initialization can be a replacement for a static 
initializer (even if it's unclear to me when exactly such a fixture would be 
really needed). In short: the problem with static initializers is that they are 
executed outside the lifecycle control of the runner… I'd say most of the 
problems and current patchy solutions inside LTC (dealing with resource 
tracking for example) are somehow related to the fact that static initializers 
and static method calls are used throughout the codebase. 
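
A guarded one-time setup of the kind suggested above as a replacement for a 
static initializer might be sketched as follows. JvmWideFixture is a 
hypothetical name and the counter only stands in for the costly setup; the 
intent is that each suite's @BeforeClass hook calls ensureInitialized(), so the 
work happens inside the runner's lifecycle instead of at class-loading time.

```java
// Hypothetical sketch; not taken from LTC or from the runner.
final class JvmWideFixture {
  private static boolean initialized;
  private static int setupRuns; // demo counter standing in for the costly setup

  private JvmWideFixture() {}

  /** Meant to be called from a @BeforeClass hook; does the work at most once per JVM. */
  static synchronized void ensureInitialized() {
    if (!initialized) {
      setupRuns++; // the expensive, once-per-JVM initialization would go here
      initialized = true;
    }
  }

  /** Number of times the setup body actually ran. */
  static synchronized int setupRuns() {
    return setupRuns;
  }
}
```

The method is synchronized so that a second caller waits until the first one 
has finished the setup rather than proceeding with a half-initialized resource.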

I am currently wondering if it's feasible to provide a single patch that will 
make a drop-in replacement of LTC. It may be the case that adding another 
skeleton class based on the "new" infrastructure and rewriting tests one by one 
to use it may be a more sensible way to go. 

The runner (alone) is currently at github if you care to take a look. I think 
Barcelona may be a good place to talk about this face to face and decide what 
to do with it. I'm leaning towards the latter myself: have parallel base 
classes and port existing tests in chunks.

                
> Extract a generic framework for running randomized tests.
> ---------------------------------------------------------
>
>                 Key: LUCENE-3492
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3492
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: general/test
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: Screen Shot 2011-10-06 at 12.58.02 PM.png
>
>
> I love the idea of randomized testing. Everyone (we at CarrotSearch, Lucene 
> and Solr folks) have their glue to make it possible. The question is if 
> there's something to pull out that others could share without having the need 
> to import Lucene-specific classes.
> The work on this issue is on my github account (lots of experiments):
> https://github.com/dweiss/randomizedtesting
> Or directly: git clone git://github.com/dweiss/randomizedtesting.git

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
