John,
Sometime in January, we are going to spend some time fixing a few minor MPL
bugs we've hit and probably work on a few enhancements (I'll send you a list
in Jan before we start anything - it's nothing major). We're also going to
work on writing a set of tests that try various plots w/ units. I was thinking
this would be a good time to introduce a standard test harness into the MPL CM
tree.
I think we should:
1) Select a standard test harness. The two big hitters seem to be unittest and
nose. unittest has the advantage that it's shipped w/ Python. nose seems to
do better with automatic discovery of test cases.
2) Establish a set of testing requirements: naming conventions, usage
conventions, etc. For example, tests should never print anything to the screen
(i.e. correct behavior is encoded in the test case itself) and should never
rely on a GUI unless that's what is being tested (this allows tests to be run
w/o an X-server).
Basically write some documentation for the test system that includes how to use
it and what's required of people when they add tests.
3) Write a test 'template' for people to use. This would define a test case
and put TODO statements or something similar in place for people to fill in
(a rough sketch is below). More than one might be good for various classes of
tests (maybe an image comparison template for testing Agg drawing and a
non-plot template for testing basic computations like transforms?).
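For example, the template might look something like this (just a sketch - the
class name, the Affine2D example, and the TODOs are placeholders to be filled
in, not existing MPL test code):

import unittest

import numpy as np
from matplotlib.transforms import Affine2D


class TestTemplate(unittest.TestCase):
    """TODO: one-line description of what this case covers."""

    def setUp(self):
        # TODO: build whatever fixtures the tests need.
        self.transform = Affine2D().scale(2.0)

    def test_scale(self):
        # TODO: replace with the real correctness condition.  The point is
        # that pass/fail is decided here, not by someone looking at output.
        result = self.transform.transform([[1.0, 1.0]])
        np.testing.assert_allclose(result, [[2.0, 2.0]])


if __name__ == '__main__':
    unittest.main()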
Some things we do on my project for our Python test systems:
We put all unit tests in a 'test' directory inside the Python package being
tested. The disadvantage of this is that potentially large tests are inside
the code to be delivered (though a nice delivery script can easily strip them
out). The advantage of this is that it makes coverage checking easier. You
can run the test case for a package and then check the coverage in the module
w/o trying to figure out which things should be coverage checked or not. If
you put the test cases in a different directory tree, then it's much harder to
identify coverage sources. Granted, in our case we have hundreds of Python
modules; in MPL's case there is really just MPL, projections, backends, and
numerix, so maybe that's not too much of a problem.
Automatic coverage isn't a must-have, but it is really nice.
I've found that it actually causes developers to write more tests because they
can run the coverage and get a "score" that other people will see. It's also a
good way to check a new submission to see if the developer has done basic
testing of the code.
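As a sketch of how the coverage hook could work (this assumes the third-party
coverage.py package and the test discovery in newer unittest versions - neither
is wired into MPL today):

import unittest

import coverage

# Measure only matplotlib itself, not the test code or third-party modules.
cov = coverage.Coverage(source=['matplotlib'])
cov.start()

# 'test' is a placeholder for wherever the test tree ends up living.
suite = unittest.defaultTestLoader.discover('test')
unittest.TextTestRunner(verbosity=1).run(suite)

cov.stop()
cov.save()
cov.report()  # prints the per-module "score" to the console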
For our tests, we require that the test never print anything to the screen,
clean up any of its output files (i.e. leave the directory in the same state it
was in before the run), and report only whether it passed or failed, plus an
error message if it failed. The key thing is that the conditions for correctness
are encoded into the test itself. We have a command line option that gets
passed to the test cases to say "don't clean up" so that you can examine the
output from a failing test case w/o modifying the test code. This option is
really useful when an image comparison fails.
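In unittest terms, that could look something like the base class below (the
class name and the MPL_TEST_KEEP_OUTPUT variable are made up for illustration -
the real mechanism would be whatever flag the runner passes down):

import os
import shutil
import tempfile
import unittest

# The runner would set this when the user asks for "don't clean up", so the
# output of a failing test can be inspected w/o editing the test itself.
KEEP_OUTPUT = os.environ.get('MPL_TEST_KEEP_OUTPUT', '0') == '1'


class OutputTestCase(unittest.TestCase):
    """Base class: each test writes its output files to a scratch directory."""

    def setUp(self):
        self.outdir = tempfile.mkdtemp(prefix='mpltest_')

    def tearDown(self):
        # Leave the directory behind only when explicitly requested, e.g. to
        # look at the image that failed a comparison.
        if not KEEP_OUTPUT:
            shutil.rmtree(self.outdir, ignore_errors=True)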
We've wrapped the basic Python unittest package. It's pretty simple and
reasonably powerful. I doubt there is anything MPL would be doing that it
can't handle. The auto-discovery of nose is nice but unnecessary in my
opinion. As long as people follow a standard way of doing things,
auto-discovery is fairly easy. Of course, if you prefer nose and don't mind the
additional tool requirement, that's fine too. Some things that are probably
needed (a rough sketch of such a runner follows the list):
- command line executable that runs the tests.
- support flags for running only some tests
- support flags for running only tests that don't need a GUI backend
  (require Agg?). This allows automated testing and visual testing to be
  combined. GUI tests could be placed in identified directories and then
  only run when requested, since by their nature they require specific
  backends and user interaction.
- nice report on test pass/fail status
- hooks to add coverage checking and reporting in the future
- test utilities
  - image comparison tools
  - ??? basically anything that helps w/ testing and could be common across
    test cases
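Here's the rough sketch I mean - it uses the test_XXX directory layout
suggested below, and the flag names are just illustrations, not an existing
interface:

#!/usr/bin/env python
"""run.py sketch: collect unittest cases from test_XXX directories."""
import argparse
import os
import unittest


def build_suite(root, categories, skip_gui):
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for name in sorted(os.listdir(root)):
        if not name.startswith('test_'):
            continue
        category = name[len('test_'):]
        if categories and category not in categories:
            continue
        if skip_gui and category == 'gui':  # made-up convention for GUI cases
            continue
        # Each .py file in the directory holds one unittest class.
        suite.addTests(loader.discover(os.path.join(root, name), pattern='*.py'))
    return suite


def main():
    parser = argparse.ArgumentParser(description='Run the MPL test suite.')
    parser.add_argument('categories', nargs='*',
                        help='only run these test_XXX categories')
    parser.add_argument('--no-gui', action='store_true',
                        help='skip tests that need a GUI backend')
    parser.add_argument('--keep-output', action='store_true',
                        help="don't clean up test output files")
    args = parser.parse_args()
    if args.keep_output:
        os.environ['MPL_TEST_KEEP_OUTPUT'] = '1'  # see the base class above
    root = os.path.dirname(os.path.abspath(__file__))
    result = unittest.TextTestRunner(verbosity=2).run(
        build_suite(root, set(args.categories), args.no_gui))
    raise SystemExit(0 if result.wasSuccessful() else 1)


if __name__ == '__main__':
    main()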
As a first cut, I would suggest something like this:
.../test/
    run.py
    mplTest/
    test_unit/
    test_transform/
    test_...
The run script would execute all/some of the tests. Any common test code would
be put in the mplTest directory. Any directory named 'test_XXX' is for test
cases where 'XXX' is some category name that can be used in the run script to
run a subset of cases. Inside each test_XXX directory, one unittest class per
file. The run script would find the .py files in the test_XXX directories,
import them, find all the unittest classes, and run them. The run script also
sets