Re: [Tutor] unittest with random population data

Sydney Shall Tue, 02 Jun 2015 20:19:04 -0700

On 31/05/2015 03:00, Cameron Simpson wrote:

On 30May2015 12:16, Sydney Shall <s.sh...@virginmedia.com> wrote:

Following advice from you generous people, I have chosen a project
that interests me, to develop some knowledge of python.
My project is a simulation of a biological population.
I have a base class and a simulation function, which uses instances of
the class.
This, after many months of work and lots of advice, now seems to work
well. It generates sensible data and when I write a small test program
it gives sensible output.
Now I want to learn to use unittest.
I have written a unittest class which works OK.
But the problem I have is that because I use the random module to
populate my initial arrays, my data is not strictly predictable even
though I am using seed(0). So the tests return many *fails* because
the numbers are not exactly correct, although they are all rather
close, consistent with the sigma value I have chosen for the spread of
my population. I do of course use *almostEqual* and not *Equal*.


First of all, several people have posted suggestions for getting
identical results on every run.
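
For reference, one common way to get identical results on every run is
to give the code under test its own random.Random instance with a fixed
seed, so that no other code drawing from the module-level generator can
disturb the sequence. This is only a minimal sketch and not necessarily
what the earlier posters suggested; make_population is a hypothetical
stand-in for whatever fills the initial arrays.

```python
import random


def make_population(size, mu, sigma, rng):
    """Hypothetical stand-in for the code that fills the initial arrays."""
    return [rng.gauss(mu, sigma) for _ in range(size)]


# Each call gets a private generator seeded with the same value.
first = make_population(5, 10.0, 2.0, random.Random(0))

# An unrelated draw from the module-level generator in between
# does not affect the private generators.
random.random()

second = make_population(5, 10.0, 2.0, random.Random(0))

assert first == second
```

Passing the generator in as an argument also makes it easy for a test
to choose the seed without touching global state.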

However, there is another approach which you might consider. (And use it
in addition to, not instead of, the reproducible suggestions.)

It is all very well to have a unit test that runs exactly the same with
a test set of data - it lets you have confidence that algorithm changes
do not change the outcome. But only for that data set.

You say that your results are "all rather close, consistent with the sigma
value I have chosen for the spread of my population". I would advocate
making some "constraint" tests that verify this property for _any_ input
data set.

Then you can run with random and _changing_ input data sets to verify
that your code produces the expected _kind_ of results with many data sets.

So you would have one test which runs with a fixed data set and
confirms predictable unchanging results. And you have other tests which
run with randomly chosen data and confirm that outcomes fall within the
parameters you expect. You can apply those checks ("outcome in range")
to both sets of tests.
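
The arrangement described above might be sketched as follows. This is
an illustration under assumptions, not the poster's code:
make_population, the MU/SIGMA/SIZE values and the 6-standard-error and
20% tolerances are all hypothetical choices. The tolerances are
deliberately loose so that a correct implementation practically never
fails by chance.

```python
import math
import random
import statistics
import unittest


def make_population(size, mu, sigma, rng):
    """Hypothetical stand-in for the simulation's population initialiser."""
    return [rng.gauss(mu, sigma) for _ in range(size)]


class PopulationTests(unittest.TestCase):
    # Assumed parameters, chosen only for illustration.
    MU = 100.0
    SIGMA = 5.0
    SIZE = 2000

    def check_outcome_in_range(self, pop):
        """Constraint check meant to hold for any input data set."""
        std_err = self.SIGMA / math.sqrt(len(pop))
        # The sample mean should sit within a few standard errors of MU.
        self.assertLess(abs(statistics.mean(pop) - self.MU), 6 * std_err)
        # The sample spread should be close to the chosen sigma.
        self.assertAlmostEqual(
            statistics.stdev(pop), self.SIGMA, delta=0.2 * self.SIGMA)

    def test_fixed_data_set(self):
        # Fixed seed: the same data on every run.
        first = make_population(
            self.SIZE, self.MU, self.SIGMA, random.Random(0))
        again = make_population(
            self.SIZE, self.MU, self.SIGMA, random.Random(0))
        self.assertEqual(first, again)
        self.check_outcome_in_range(first)

    def test_random_data_sets(self):
        # Changing seeds: a different data set on every run.
        for _ in range(20):
            seed = random.randrange(2 ** 32)
            # subTest reports the seed, so a failure can be replayed.
            with self.subTest(seed=seed):
                pop = make_population(
                    self.SIZE, self.MU, self.SIGMA, random.Random(seed))
                self.check_outcome_in_range(pop)
```

Saved in a file, this can be run with "python -m unittest". Reporting
the seed matters: when a random run fails, that seed can be moved into
a fixed test to reproduce the failure.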

As an example, I have a few classes which maintain data structures which
are sensitive to boundary conditions. The glaring example is a numeric
range class which stores contiguous ranges efficiently (a sequence of
(low,high) pairs). It has a few add/remove operations which are meant to
maintain that sequence in ordered minimal form. Cutting and merging
adjacent ranges is very easy to get wrong, very sensitive to off-by-one
logic errors.

So my tests for this class include some random tests which do random
unpredictable add/remove operations, and run a consistency check on the
object after each operation. This gives me good odds of exercising some
tricky sequence which I have not considered explicitly myself.
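
To illustrate the idea, here is a minimal sketch of such a random
consistency test. It is not the cs.range code: SpanSet is a
hypothetical toy class written for this example, storing half-open
(low, high) pairs, and the test compares it against a plain Python set
after every random operation.

```python
import random
import unittest


class SpanSet:
    """Hypothetical toy: integers kept as sorted, disjoint,
    non-adjacent half-open (low, high) spans."""

    def __init__(self):
        self.spans = []

    def add(self, low, high):
        """Add every integer in [low, high); requires low < high."""
        kept = []
        for s_low, s_high in self.spans:
            if s_high < low or s_low > high:
                kept.append((s_low, s_high))  # no overlap, not adjacent
            else:
                # Overlapping or touching: absorb into the new span.
                low, high = min(low, s_low), max(high, s_high)
        kept.append((low, high))
        kept.sort()
        self.spans = kept

    def remove(self, low, high):
        """Remove every integer in [low, high); requires low < high."""
        kept = []
        for s_low, s_high in self.spans:
            if s_low < low:
                kept.append((s_low, min(s_high, low)))   # part below
            if s_high > high:
                kept.append((max(s_low, high), s_high))  # part above
        self.spans = kept

    def values(self):
        return {n for low, high in self.spans for n in range(low, high)}

    def check(self):
        """Consistency check: spans non-empty, ordered, not touching."""
        previous_high = None
        for low, high in self.spans:
            assert low < high
            if previous_high is not None:
                assert previous_high < low
            previous_high = high


class RandomOperationTests(unittest.TestCase):
    def test_random_set_equivalence(self):
        seed = random.randrange(2 ** 32)
        rng = random.Random(seed)
        spans = SpanSet()
        reference = set()  # trusted model to compare against
        for _ in range(500):
            low = rng.randrange(0, 100)
            high = low + rng.randrange(1, 10)
            if rng.random() < 0.5:
                spans.add(low, high)
                reference.update(range(low, high))
            else:
                spans.remove(low, high)
                reference.difference_update(range(low, high))
            spans.check()
            self.assertEqual(spans.values(), reference, "seed=%d" % seed)
```

Checking after every operation, rather than once at the end, means a
failure points at the operation that broke the structure.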

You can see the test suite here:

  https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py

It has a bunch of explicit specific tests up the top, and then the
random consistency test down the bottom as "test30random_set_equivalence".

Cheers,
Cameron Simpson <c...@zip.com.au>

MS-Word is Not a document exchange format - Jeff Goldberg
http://www.goldmark.org/netrants/no-word/attach.html

Cameron,
Thanks for your most helpful reply.

I have studied the material you indicated and it has been most helpful.
I think that I have understood the principle involved, but I have had
some problem implementing it.

The range tests are mostly clear to me but there is one aspect I cannot
follow.
You use in this suite imports from Range, including Range, overlap,
spans and Span.
Are these programs that you have written? If so, are they specific to
your set up or are they generic? If so, is it possible to obtain these
programs?
I have established a very primitive test suite based on your cs.range
notions and it works fine, but it would be better, I am sure, to do it
properly.

Finally, I have one comment for the respected Moderator, if he is not
out on a walk in the highlands in this cold and wet weather.

I have taken the liberty of raising my problem here rather than
elsewhere, because I have observed that the biological and bio-medical
community, who always come late to new notions, is now rapidly
discovering python. A great deal of work in these fields involves either
stochastic simulations or statistical problems of analysis. The latter
are more or less straightforward, but the simulations are not.

Thanks for all the help. You people are a model of how we could perhaps
civilize humanity.




--
Sydney
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
