Great stuff Dawid!
The tests.iters functionality sounds like it will be particularly useful.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10



On Sun, Apr 15, 2012 at 10:45 AM, Dawid Weiss <[email protected]> wrote:
> Ah... such a relief to finally have it committed even if this means
> the work has just begun on straightening it all out. So.
>
> 1) WHAT'S BEEN DONE?
>
> The primary objective was to pull the "randomized" runner out of
> Lucene and make it available as a separate project for reuse
> elsewhere. This grew into two things:
>
> a) RandomizedRunner - a runner that has built-in support for
> randomization and other interesting features,
> b) junit4-ant - an ANT task for running JUnit4 tests in parallel JVMs,
> with event aggregation, load balancing, log capture, reporting that
> differs from standard ANT's, and resilience to forked JVM crashes.
>
> Everything above is heavily covered with unit tests (the code is on
> github here: https://github.com/carrotsearch/randomizedtesting).
>
> LUCENE-3808 removes LuceneTestCaseRunner, replacing it with
> RandomizedRunner. It also modifies build scripts to run junit4-ant
> instead of ANT's default <junit>.
>
> 2) HOW DOES IT AFFECT ME AS A LUCENE/SOLR DEVELOPER?
>
> - The most visible change is that the 'random' field in LuceneTestCase
> is gone. This change was motivated by the fact that the field's value
> was read from places where it shouldn't be read, passed to places
> where it shouldn't be passed, etc. Instead, the Random instance for a
> given scope (see below) can be acquired from a static method in
> LuceneTestCase called random(). In essence, you can just add
> parentheses to your previous 'random' field references (turning them
> into random() calls) and it _should_ work out of the box. There is a
> difference, though: the Random object returned by random() is valid
> only for the scope it was created for. So any of the following will
> result in an exception: saving a Random instance to a field in a
> static scope (@BeforeClass) and reusing it in a test, passing a Random
> instance from one thread to another, saving a Random instance to a
> field in one test and using it in another, etc. In short: the result
> of random() is per-thread and only valid for the scope (test method,
> hook) it was acquired for. You _can_ call random() from non-test
> threads -- they will be given their own thread-local Random instances.
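>
> To illustrate the scoping rules, here is a minimal, hypothetical
> sketch (the class and field names are mine; random() and
> LuceneTestCase are the real ones):
>
>   import java.util.Random;
>   import org.apache.lucene.util.LuceneTestCase;
>   import org.junit.BeforeClass;
>   import org.junit.Test;
>
>   public class TestRandomScoping extends LuceneTestCase {
>     private static Random leaked;
>
>     @BeforeClass
>     public static void beforeClass() {
>       leaked = random();            // legal to call here...
>       int x = random().nextInt();   // ...and to use directly.
>     }
>
>     @Test
>     public void testScoping() {
>       random().nextInt();  // OK: instance for the current test scope.
>       leaked.nextInt();    // throws: the instance escaped its
>                            // @BeforeClass scope.
>     }
>   }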
>
> - The 'random seed' is a single HEX-encoded long. The "three seeds"
> from before are gone. Everything is a derivative of the initial master
> seed.
>
> - I provided a 'help on syntax' for test properties. Type:
>
> ant test-help
>
> and the most common use case scenarios will be dumped to your console.
>
> - A notable difference is that the 'tests.iter' property has been
> renamed to 'tests.iters' (you'll get a build failure and a message if
> you try to use the former). I could add a fallback, but I'd rather not
> introduce any more aliases.
>
> - "tests.iters" is no longer a poor-man's loop. It really re-runs a
> duplicate of a given test (or tests), including any @Before/@After
> hooks and setups. In theory, this means it is now possible to
> reiterate ANY test, no matter how complex. If it doesn't depend on
> static state, it can be repeated. This also links to how seed is used.
> It's probably best explained on an example:
>
> ant -Dtests.seed=deadbeef -Dtests.iters=3
> -Dtestcase=TestSubScorerFreqs test-core
>
> the above will result in 3 executions of every test in
> TestSubScorerFreqs. What you'll get on the console is:
>
>   [junit4] <JUnit4> says hello. Random seed: deadbeef
>   [junit4] Expected execution time on JVM J0:     0.02s
>   [junit4] Executing 1 suite with 1 JVM.
>   [junit4] Running org.apache.lucene.search.TestSubScorerFreqs
>   [junit4] Tests run:   9, Failures:   0, Errors:   0, Skipped:   0,
> Time:  0.45s
>   [junit4]
>   [junit4] JVM J0:     0.36 ..     0.96 =     0.60s
>   [junit4] Execution time total: 0.98 sec.
>   [junit4] Tests summary: 1 suite, 9 tests
>
> and if you peek at the log file with test results
> (build/core/test/tests-report.txt) you'll see the details of each
> executed test:
>
> Executing 1 suite with 1 JVM.
> Running org.apache.lucene.search.TestSubScorerFreqs
> OK      0.04s | TestSubScorerFreqs.testTermQuery {#0
> seed=[DEADBEEF:BAE943E3CC27A0F]}
> OK      0.01s | TestSubScorerFreqs.testTermQuery {#1
> seed=[DEADBEEF:BFF828C20800B123]}
> OK      0.01s | TestSubScorerFreqs.testTermQuery {#2
> seed=[DEADBEEF:3111BE1E59C4F9E8]}
> OK      0.02s | TestSubScorerFreqs.testBooleanQuery {#0
> seed=[DEADBEEF:3DDFB93BCC712A41]}
> OK      0.01s | TestSubScorerFreqs.testBooleanQuery {#1
> seed=[DEADBEEF:898905C7F8B3E16D]}
> OK      0.01s | TestSubScorerFreqs.testBooleanQuery {#2
> seed=[DEADBEEF:760931BA977A9A6]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#0
> seed=[DEADBEEF:A7ADE8DE1DB7CA0B]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#1
> seed=[DEADBEEF:13FB542229750127]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#2
> seed=[DEADBEEF:9D12C2FE78B149EC]}
> Tests run:   9, Failures:   0, Errors:   0, Skipped:   0, Time:  0.45s
>
> Note that tests.iters=3 resulted in every test case being executed
> three times, but they all count individually (so the reported total is
> 9 tests). What's also clearly visible is that the master seed is
> constant for all tests, but each repetition gets a (predictable)
> derivative of it. This way you can reiterate over a test N times, each
> time with a different seed. Now, compare this to:
>
> ant -Dtests.seed=deadbeef:cafebabe -Dtests.iters=3
> -Dtestcase=TestSubScorerFreqs test-core
>
> The log file will now contain:
>
> Executing 1 suite with 1 JVM.
> Running org.apache.lucene.search.TestSubScorerFreqs
> OK      0.04s | TestSubScorerFreqs.testTermQuery {#0 seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testTermQuery {#1 seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testTermQuery {#2 seed=[DEADBEEF:CAFEBABE]}
> OK      0.02s | TestSubScorerFreqs.testBooleanQuery {#0
> seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testBooleanQuery {#1
> seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testBooleanQuery {#2
> seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#0 
> seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#1 
> seed=[DEADBEEF:CAFEBABE]}
> OK      0.01s | TestSubScorerFreqs.testPhraseQuery {#2 
> seed=[DEADBEEF:CAFEBABE]}
> Tests run:   9, Failures:   0, Errors:   0, Skipped:   0, Time:  0.46s
>
> Here we fixed both the master and the per-method seed. Every
> repetition will have the same randomness. Test failures report their
> seeds as "master:test", so you will be able to use either scenario,
> depending on the use case.
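>
> To make the "full repetition" point concrete, here is a hypothetical
> LuceneTestCase subclass (the names are mine): with tests.iters, each
> repetition re-runs the @Before/@After hooks and draws from its own
> derived seed, unless the seed is pinned to "master:test" as above.
>
>   import org.apache.lucene.store.Directory;
>   import org.apache.lucene.util.LuceneTestCase;
>   import org.junit.After;
>   import org.junit.Before;
>   import org.junit.Test;
>
>   public class TestRepeatedWithHooks extends LuceneTestCase {
>     private Directory dir;
>
>     @Before
>     public void allocate() {
>       dir = newDirectory();   // re-created for every repetition
>     }
>
>     @After
>     public void release() throws Exception {
>       dir.close();            // closed after every repetition
>     }
>
>     @Test
>     public void testSomething() {
>       // Differs across repetitions unless the per-test seed is pinned.
>       int docs = random().nextInt(100);
>       // ...
>     }
>   }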
>
> - "tests.iter.min" has been temporarily removed. I will think of a way
> to reintegrate it.
>
> - The console output (sysouts/syserrs) is only emitted to ANT's output
> if a test/suite fails. If a test/suite passes, its output is
> redirected in full to the "tests-report.txt" file for inspection if
> needed.
>
> 3) WHAT ARE THE OTHER GOODIES?
>
> There are many... For one thing, the console output will be much less
> chatty. Typically, one will see something like this:
>
>   [junit4] <JUnit4> says hello. Random seed: 71F222160952E78A
>   [junit4] Expected execution time on JVM J0:    93.93s
>   [junit4] Expected execution time on JVM J1:    93.93s
>   [junit4] Expected execution time on JVM J2:    93.93s
>   [junit4] Expected execution time on JVM J3:    93.93s
>   [junit4] Executing 296 suites with 4 JVMs.
>   [junit4] Running org.apache.lucene.index.TestTermVectorsWriter
>   [junit4] Tests run:  13, Failures:   0, Errors:   0, Skipped:   0,
> Time:  1.31s
>   [junit4]
>   [junit4] Running org.apache.lucene.index.TestStressIndexing2
>   [junit4] Tests run:   3, Failures:   0, Errors:   0, Skipped:   0,
> Time:  1.89s
> ...
>
> The number of JVMs used depends on the number of cores you have (it is
> intentionally lower than the core count to leave some room for
> infrastructure). 'ant test-help' contains instructions on how to
> override the number of JVMs, for example to make the execution
> sequential.
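>
> For example, assuming the property reported by test-help is
> 'tests.jvms' (check test-help for the exact name), a sequential run
> would look something like:
>
> ant -Dtests.jvms=1 -Dtestcase=TestSubScorerFreqs test-core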
>
> Another thing is that failed tests (their suite, output, etc.) are
> appended to tests-failures.txt. I remember somebody once complained
> that it's impossible to find a test failure in ANT's log; now it
> should be easier. I made a test fail; the console output looks like
> this:
>
>   [junit4] Running org.apache.lucene.search.TestSubScorerFreqs
>   [junit4] FAILURE 0.06s | TestSubScorerFreqs.testBooleanQuery
>   [junit4]    > Throwable #1: java.lang.AssertionError
>   [junit4]    >        at
> __randomizedtesting.SeedInfo.seed([B2258B2DE68D190A:8FFA3216F4518DA4]:0)
>   [junit4]    >        at org.junit.Assert.fail(Assert.java:92)
>   [junit4]    >        at org.junit.Assert.fail(Assert.java:100)
>   [junit4]    >        at
> org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery(TestSubScorerFreqs.java:156)
> ....[snip]
>   [junit4]    >
>   [junit4]   2> NOTE: reproduce with: ant test
> -Dtests.class=*.TestSubScorerFreqs
> -Dtests.filter.method=testBooleanQuery -Dtests.seed=B2258B2DE68D190A
> -Dargs="-Dfile.encoding=Cp1252"
>   [junit4]   2> NOTE: reproduce with: ant test
> -Dtests.filter=*.TestSubScorerFreqs
> -Dtests.filter.method=testBooleanQuery -Dtests.seed=B2258B2DE68D190A
> -Dargs="-Dfile.encoding=Cp1252"
>   [junit4]   2>
>   [junit4]    > (@AfterClass output)
>   [junit4]   2> NOTE: test params are: codec=Lucene40:
> {f=PostingsFormat(name=MockSep)}, sim=DefaultSimilarity, locale=da,
> timezone=Asia/Bishkek
>   [junit4]   2> NOTE: all tests run in this JVM:
>   [junit4]   2> [TestSubScorerFreqs, TestSubScorerFreqs]
>   [junit4]   2> NOTE: Windows 7 6.1 amd64/Oracle Corporation 1.7.0_03
> (64-bit)/cpus=8,threads=1,free=97723464,total=128647168
>   [junit4]   2>
>   [junit4] Tests run:   3, Failures:   1, Errors:   0, Skipped:   0,
> Time:  0.45s <<< FAILURES!
>
> Note "<<< FAILURES!" -- this is something to grep for. The " >" marker
> is a stack trace from the throwable, "2>" is standard error, "1>" is
> standard output. The same error is also appended to
> tests-failures.txt. The stack trace contains a synthetic entry with
> the master seed and the test's seed.
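>
> So, assuming you capture ANT's console output to a file, finding
> failures is a one-liner (tests-failures.txt already contains only the
> failed tests, so it needs no grepping at all):
>
> ant -Dtestcase=TestSubScorerFreqs test-core | tee test-console.log
> grep -n "<<< FAILURES!" test-console.log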
>
> For those who prefer more visual (color) reports -- look for the
> "tests-report-{module-name}" folder under the build folder. It
> contains an index.html with a test report that should be easier to
> navigate. An example of what such a report looks like is here:
>
> http://builds.carrot2.org/browse/C2HEAD-SOURCES/latest/artifact/JOB1/JUnit-Test-Report/index.html
>
> Yet another thing is that test suites are load-balanced across forked
> JVMs (with job stealing in place). For the initial balancing there are
> precomputed statistics (which should be updated from time to time!).
> Any performance improvement this yields is currently eaten away by the
> cost of random() calls from tight loops in certain tests (it's better
> to just cache the result in a local variable). I'll try to pinpoint
> these later on. The timings shouldn't be much different from what they
> were before, so it's not a performance drop.
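>
> A hedged sketch of that micro-optimization (the loop and field names
> are made up; only random() comes from LuceneTestCase). Since random()
> presumably has to resolve the per-thread, per-scope instance on each
> call, hoist it out of hot loops:
>
>   public void testTightLoop() {
>     long slow = 0;
>     for (int i = 0; i < 1000000; i++) {
>       slow += random().nextInt(10);  // scope lookup on every iteration
>     }
>
>     long fast = 0;
>     Random r = random();             // cache the per-scope instance once
>     for (int i = 0; i < 1000000; i++) {
>       fast += r.nextInt(10);
>     }
>   }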
>
> I don't know what else... I'm sure there are things that I haven't
> covered. Ask if something is not clear or something is not working.
>
> Dawid
>
