Re: [Tutor] unittest with random population data

2015-06-03 Thread Cameron Simpson

[ I've taken this discussion back to the tutor list. - Cameron ]

On 01Jun2015 18:14, Sydney Shall  wrote:

On 31/05/2015 03:00, Cameron Simpson wrote:

You say that your results are "all rather close, consistent with the sigma
value I have chosen for the spread of my population". I would advocate
making some "constraint" tests that verify this property for _any_ input
data set.

Then you can run with random and _changing_ input data sets to verify
that your code produces the expected _kind_ of results with many data sets.

[...]

range class which stores contiguous ranges efficiently (a sequence of
(low,high) pairs). It has a few add/remove operations which are meant to
maintain that sequence in ordered minimal form. Cutting and merging
adjacent ranges is very easy to get wrong, very sensitive to off-by-one
logic errors.

So my tests for this class include some random tests which do random
unpredictable add/remove operations, and run a consistency check on the
object after each operation. This gives me good odds of exercising some
tricky sequence which I have not considered explicitly myself.

You can see the test suite here:
 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py


I have studied the material you indicated and it has been most
helpful. I think that I have understood the principle involved, but I
have had some problem implementing it.

The range tests are mostly clear to me but there is one aspect I 
cannot follow.
You use in this suite imports from Range, including Range, overlap, 
spans and Span.
Are these programs that you have written? If so, are they specific to 
your set up or are they generic? If so, is it possible to obtain these 
programs?


The "cs.range" is a module of my own. As you might imagine, cs.range_tests has 
the entire purpose of testing cs.range. cs.range is here:


 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range.py

Feel free. Since it in turn has some dependencies, the easiest way to get it is 
to use "pip" to install it, as pip will also fetch and install its 
dependencies. But if you just want something to read, fetch and enjoy.


I have established a very primitive test suite based on your cs.range notions
and it works fine, but it would be better, I am sure, to do it properly.

There are many other people whose test writing ability surpasses mine.  But 
you're welcome to learn anything you can from mine, and to ask questions.  

Finally, I have one comment for the respected Moderator, if he is not 
out on a walk in the highlands in this cold  and wet weather.


I have taken the liberty of raising my problem here rather than 
elsewhere, because I have observed that the biological and bio-medical 
community, who always come late to new notions, is now rapidly 
discovering python. A great deal of work in these fields involves 
either stochastic simulations or statistical problems of analysis. The 
latter are more or less straight-forward, but the simulations are not.


You might ask a separate question on the python-l...@python.org about 
simulations. It has a wider audience than the tutor list and may well include 
people doing simulation work, or who know where to look.


Thanks for all the help. You people are a model of how we could 
perhaps civilize humanity.


Nah. We might all be striving to be a model of how humanity might be when 
civilised though...


Cheers,
Cameron Simpson 

You my man are a danger to society and should be taken out of society for all
our sakes. As to what is done to you once removed I couldn't care less.
   - Roy G. Culley, Unix Systems Administrator
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] unittest with random population data

2015-06-03 Thread Alan Gauld

On 01/06/15 18:14, Sydney Shall wrote:


Finally, I have one comment for the respected Moderator, if he is not
out on a walk in the highlands in this cold  and wet weather.


No, that's on tomorrow's schedule :-)


I have taken the liberty of raising my problem here rather than
elsewhere, because I have observed that the biological and bio-medical
community, who always come late to new notions, is now rapidly
discovering python. A great deal of work in these fields involves either
stochastic simulations or statistical problems of analysis. The latter
are more or less straight-forward, but the simulations are not.


I have no problem with this kind of material on tutor because, although 
the list is for "beginners to Python", that includes those with much 
experience in other languages and who may expect to use TDD from the 
beginning.


So TDD techniques using standard library modules is a valid
topic even though it may be too advanced for those beginners who
are new to programming. We need to balance coverage to cater for
both categories of "beginner".

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] unittest with random population data

2015-06-02 Thread Sydney Shall

On 31/05/2015 03:00, Cameron Simpson wrote:

On 30May2015 12:16, Sydney Shall  wrote:

Following advice from you generous people, I have chosen a project
that interests me, to develop some knowledge of python.
My project is a simulation of a biological population.
I have a base class and a simulation function, which uses instances of
the class.
This, after many months of work and lots of advice, now seems to work
well. It generates sensible data and when I write a small test program
it gives sensible output.
Now I want to learn to use unittest.
I have written a unittest class which works OK.
But the problem I have is that because I use the random module to
populate my initial arrays, my data is not strictly predictable even
though I am using seed(0). So the tests return many *fails* because
the numbers are not exactly correct, although they are all rather
close, consistent with the sigma value I have chosen for the spread of
my population. I do of course use *almostEqual* and not *Equal*.


First of all, several people have posted suggestions for getting
identical results on every run.

However, there is another approach, which you might consider. (And use
in addition, not instead of, the reproducible suggestions).

It is all very well to have a unit test that runs exactly the same with
a test set of data - it lets you have confidence that algorithm changes
do not change the outcome. But only for that data set.

You say that your results are "all rather close, consistent with the sigma
value I have chosen for the spread of my population". I would advocate
making some "constraint" tests that verify this property for _any_ input
data set.

Then you can run with random and _changing_ input data sets to verify
that your code produces the expected _kind_ of results with many data sets.

So you would have one test which runs with a fixed data set and
confirms predictable unchanging results. And you have other tests which
run with randomly chosen data and confirm that outcomes fall within the
parameters you expect. You can apply those checks ("outcome in range")
to both sets of tests.

As an example, I have a few classes which maintain data structures which
are sensitive to boundary conditions. The glaring example is a numeric
range class which stores contiguous ranges efficiently (a sequence of
(low,high) pairs). It has a few add/remove operations which are meant to
maintain that sequence in ordered minimal form. Cutting and merging
adjacent ranges is very easy to get wrong, very sensitive to off-by-one
logic errors.

So my tests for this class include some random tests which do random
unpredictable add/remove operations, and run a consistency check on the
object after each operation. This gives me good odds of exercising some
tricky sequence which I have not considered explicitly myself.

You can see the test suite here:

  https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py

It has a bunch of explicit specific tests up the top, and then the
random consistency test down the bottom as "test30random_set_equivalence".

Cheers,
Cameron Simpson 

MS-Word is Not a document exchange format - Jeff Goldberg
http://www.goldmark.org/netrants/no-word/attach.html


Cameron,
Thanks for your most helpful reply.
I have studied the material you indicated and it has been most helpful. 
I think that I have understood the principle involved, but I have had 
some problem implementing it.


The range tests are mostly clear to me but there is one aspect I cannot 
follow.
You use in this suite imports from Range, including Range, overlap, 
spans and Span.
Are these programs that you have written? If so, are they specific to 
your set up or are they generic? If so, is it possible to obtain these 
programs?
I have established a very primitive test suite based on your cs.range 
notions and it works fine, but it would be better, I am sure, to do it 
properly.


Finally, I have one comment for the respected Moderator, if he is not 
out on a walk in the highlands in this cold  and wet weather.


I have taken the liberty of raising my problem here rather than 
elsewhere, because I have observed that the biological and bio-medical 
community, who always come late to new notions, is now rapidly 
discovering python. A great deal of work in these fields involves either 
stochastic simulations or statistical problems of analysis. The latter 
are more or less straight-forward, but the simulations are not.


Thanks for all the help. You people are a model of how we could perhaps 
civilize humanity.




--
Sydney


Re: [Tutor] unittest with random population data

2015-06-02 Thread Sydney Shall

On 02/06/2015 07:59, Steven D'Aprano wrote:

Please keep your replies on the tutor list, so that others may offer
advice, and learn from your questions.

Thanks,

Steve

On Mon, Jun 01, 2015 at 06:03:08PM +0100, Sydney Shall wrote:

On 31/05/2015 00:41, Steven D'Aprano wrote:

On Sat, May 30, 2015 at 12:16:01PM +0100, Sydney Shall wrote:


I have written a unittest class which works OK.
But the problem I have is that because I use the random module to
populate my initial arrays, my data is not strictly predictable even
though I am using seed(0).


Please show us how you populate your arrays, because what you describe
sounds wrong. Seeding to the same value should give the same sequence of
values:

py> import random
py> random.seed(0)
py> a = [random.random() for i in range(10**6)]
py> random.seed(0)
py> b = [random.random() for i in range(10**6)]
py> a == b
True



Thank you for the advice Steven.
I was of course aware that I had to use random.seed(0), which I had
done. I was puzzled by the fact that it did not give me reproducible
results, which it did when I was learning python. But because you drew
attention to the problem, I have looked at it again. I surmised that
perhaps I had put the seed statement in the wrong place. I have tried
several places, but I always get the same spread of results.

Perhaps, to help get advice, I should explain that I populate a list thus:
  self.ucc = np.random.normal(self.mean_ucc, self.sigma_ucc, 200)
This does give me a list of 200 slightly different numbers.
The mean_ucc is always 15.0 and the sigma value is always 3.75.
The actual mean and sigma of the random numbers are checked to be
within 5.0% of 15.0 and 3.75 respectively.
Following your advice I did a little test. I repeated a little test
program that I have written, which gives me sensible and proper results.
I repeated the test 8 times and then noted a useful output result.
When I plot the actual mean of the population used against the output
result I chose, I obtain a perfect straight line, which I should.

Now I still think that I am using the random.seed(0) either incorrectly
or in the wrong place.
If there is any other information that might clarify my problem, I will
be grateful to be told.
I would be most grateful for any guidance you may have, indicating where
I should look for what I suspect is a beginner's error.
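
One possible explanation, offered as an assumption rather than a diagnosis:
random.seed(0) seeds only the standard library's generator, while
np.random.normal draws from NumPy's own generator, which is seeded
separately (np.random.seed). The sketch below illustrates the effect with
the standard library alone. The separate random.Random instance stands in
for NumPy's generator, and make_population is a hypothetical helper, not
code from the simulation.

```python
import random


def make_population(rng, mean=15.0, sigma=3.75, size=200):
    """Hypothetical stand-in for the np.random.normal call in the thread."""
    return [rng.gauss(mean, sigma) for _ in range(size)]


# A generator with its own state, standing in for numpy.random.
# (Assumption: NumPy is left out so the sketch runs with the stdlib alone.)
separate_rng = random.Random()

# Seeding the module-level generator does not touch separate_rng ...
random.seed(0)
first = make_population(separate_rng)
random.seed(0)
second = make_population(separate_rng)
print(first == second)

# ... whereas seeding the generator that is actually used does.
separate_rng.seed(0)
third = make_population(separate_rng)
separate_rng.seed(0)
fourth = make_population(separate_rng)
print(third == fourth)
```

In the NumPy case the corresponding step would be to call np.random.seed(0)
before np.random.normal(...), instead of (or as well as) random.seed(0).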

--
Sydney




Apologies. I thought that I had done so. I will be more careful in future.

--
Sydney


Re: [Tutor] unittest with random population data

2015-05-30 Thread Cameron Simpson

On 30May2015 12:16, Sydney Shall  wrote:

Following advice from you generous people, I have chosen a project that 
interests me, to develop some knowledge of python.
My project is a simulation of a biological population.
I have a base class and a simulation function, which uses instances of 
the class.
This, after many months of work and lots of advice, now seems to work 
well. It generates sensible data and when I write a small test program 
it gives sensible output.

Now I want to learn to use unittest.
I have written a unittest class which works OK.
But the problem I have is that because I use the random module to 
populate my initial arrays, my data is not strictly predictable even 
though I am using seed(0). So the tests return many *fails* because 
the numbers are not exactly correct, although they are all rather 
close, consistent with the sigma value I have chosen for the spread of 
my population. I do of course use *almostEqual* and not *Equal*.


First of all, several people have posted suggestions for getting identical 
results on every run.


However, there is another approach, which you might consider. (And use in 
addition, not instead of, the reproducible suggestions).


It is all very well to have a unit test that runs exactly the same with a test 
set of data - it lets you have confidence that algorithm changes do not change 
the outcome. But only for that data set.


You say that your results are "all rather close, consistent with the sigma
value I have chosen for the spread of my population". I would advocate making 
some "constraint" tests that verify this property for _any_ input data set.


Then you can run with random and _changing_ input data sets to verify that your 
code produces the expected _kind_ of results with many data sets.


So you would have one test which runs with a fixed data set and confirms 
predictable unchanging results. And you have other tests which run with randomly 
chosen data and confirm that outcomes fall within the parameters you expect.  
You can apply those checks ("outcome in range") to both sets of tests.
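
The two kinds of test described above might be sketched as follows. This is
a minimal illustration, not code from the thread: make_population is a
hypothetical stand-in for the simulation's setup, the mean, sigma and size
are the figures quoted elsewhere in this thread, and the tolerance of five
standard errors is an assumption of this sketch. (With 200 samples a fixed
5% band on sigma is only about one standard error wide, since
3.75 / sqrt(2 * 200) = 0.1875, so a randomised test using that band would
be expected to fail fairly often.)

```python
import math
import random
import statistics
import unittest

MEAN, SIGMA, SIZE = 15.0, 3.75, 200  # figures quoted in the thread


def make_population(rng, mean=MEAN, sigma=SIGMA, size=SIZE):
    """Hypothetical stand-in for the simulation's population setup."""
    return [rng.gauss(mean, sigma) for _ in range(size)]


class PopulationTests(unittest.TestCase):
    def check_constraints(self, population):
        # "Outcome in range" checks, usable with any input data set.
        # Tolerances are five standard errors (an assumption of this sketch).
        mean_tol = 5 * SIGMA / math.sqrt(SIZE)
        sigma_tol = 5 * SIGMA / math.sqrt(2 * SIZE)
        self.assertEqual(len(population), SIZE)
        self.assertLess(abs(statistics.mean(population) - MEAN), mean_tol)
        self.assertLess(abs(statistics.stdev(population) - SIGMA), sigma_tol)

    def test_fixed_seed_is_reproducible(self):
        # Fixed data set: identical results on every run.
        first = make_population(random.Random(0))
        second = make_population(random.Random(0))
        self.assertEqual(first, second)
        self.check_constraints(first)

    def test_random_data_meets_constraints(self):
        # Changing data sets: only the kind of result is checked.
        for _ in range(20):
            seed = random.randrange(2 ** 32)
            with self.subTest(seed=seed):  # report the seed on failure
                self.check_constraints(make_population(random.Random(seed)))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PopulationTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Recording the seed in the subTest means a failing random case can be rerun
as a fixed case later. Note that subTest needs Python 3.4 or later.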


As an example, I have a few classes which maintain data structures which are 
sensitive to boundary conditions. The glaring example is a numeric range class 
which stores contiguous ranges efficiently (a sequence of (low,high) pairs). It 
has a few add/remove operations which are meant to maintain that sequence in 
ordered minimal form. Cutting and merging adjacent ranges is very easy to get 
wrong, very sensitive to off-by-one logic errors.


So my tests for this class include some random tests which do random 
unpredictable add/remove operations, and run a consistency check on the object 
after each operation. This gives me good odds of exercising some tricky 
sequence which I have not considered explicitly myself.


You can see the test suite here:

 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py

It has a bunch of explicit specific tests up the top, and then the random 
consistency test down the bottom as "test30random_set_equivalence".
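
For readers who only want the shape of such a test, here is a small sketch
of a random consistency test. It is not the cs.range code: SimpleRange is a
hypothetical toy class written for this illustration, using half-open
(low, high) spans, and the comparison against a plain set is an assumption
suggested only by the test name quoted above.

```python
import random


class SimpleRange:
    """Hypothetical toy: sorted, non-adjacent, half-open (low, high) spans."""

    def __init__(self):
        self.spans = []

    def add(self, low, high):
        """Add the integers low..high-1, merging touching spans."""
        if low >= high:
            return
        kept = []
        for lo, hi in self.spans:
            if hi < low or lo > high:
                kept.append((lo, hi))  # neither overlapping nor adjacent
            else:
                low, high = min(low, lo), max(high, hi)  # absorb this span
        kept.append((low, high))
        self.spans = sorted(kept)

    def discard(self, low, high):
        """Remove the integers low..high-1, cutting spans where needed."""
        if low >= high:
            return
        kept = []
        for lo, hi in self.spans:
            if lo < low:
                kept.append((lo, min(hi, low)))  # part left of the cut
            if hi > high:
                kept.append((max(lo, high), hi))  # part right of the cut
        self.spans = kept

    def check(self):
        """Consistency check: spans non-empty, ordered, not touching."""
        previous_high = None
        for lo, hi in self.spans:
            assert lo < hi, (lo, hi)
            assert previous_high is None or previous_high < lo, self.spans
            previous_high = hi

    def as_set(self):
        return {i for lo, hi in self.spans for i in range(lo, hi)}


def random_equivalence_run(seed, steps=500, limit=60):
    """Apply random add/discard operations, checking after each one."""
    rng = random.Random(seed)
    ranges, model = SimpleRange(), set()
    for _ in range(steps):
        low = rng.randrange(limit)
        high = low + rng.randrange(1, 8)
        if rng.random() < 0.5:
            ranges.add(low, high)
            model.update(range(low, high))
        else:
            ranges.discard(low, high)
            model.difference_update(range(low, high))
        ranges.check()
        assert ranges.as_set() == model, (seed, low, high)
    return ranges


final = random_equivalence_run(seed=random.randrange(2 ** 32))
```

The set acts as a trivially correct model, so any off-by-one slip in the
span arithmetic shows up as a mismatch within a few operations.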


Cheers,
Cameron Simpson 

MS-Word is Not a document exchange format - Jeff Goldberg
http://www.goldmark.org/netrants/no-word/attach.html


Re: [Tutor] unittest with random population data

2015-05-30 Thread Steven D'Aprano
On Sat, May 30, 2015 at 12:16:01PM +0100, Sydney Shall wrote:

> I have written a unittest class which works OK.
> But the problem I have is that because I use the random module to 
> populate my initial arrays, my data is not strictly predictable even 
> though I am using seed(0).

Please show us how you populate your arrays, because what you describe 
sounds wrong. Seeding to the same value should give the same sequence of 
values:

py> import random
py> random.seed(0)
py> a = [random.random() for i in range(10**6)]
py> random.seed(0)
py> b = [random.random() for i in range(10**6)]
py> a == b
True


-- 
Steve


Re: [Tutor] unittest with random population data

2015-05-30 Thread Laura Creighton
In a message of Sat, 30 May 2015 12:16:01 +0100, Sydney Shall writes:
>MAC OSX 10.10.3
>Enthought Python 2.7
>
>I am an almost beginner.
>
>Following advice from you generous people, I have chosen a project that 
>interests me, to develop some knowledge of python.
>My project is a simulation of a biological population.
>I have a base class and a simulation function, which uses instances of 
>the class.
>This, after many months of work and lots of advice, now seems to work 
>well. It generates sensible data and when I write a small test program 
>it gives sensible output.
>Now I want to learn to use unittest.
>I have written a unittest class which works OK.
>But the problem I have is that because I use the random module to 
>populate my initial arrays, my data is not strictly predictable even 
>though I am using seed(0). So the tests return many *fails* because the 
>numbers are not exactly correct, although they are all rather close, 
>consistent with the sigma value I have chosen for the spread of my 
>population. I do of course use *almostEqual* and not *Equal*.
>So, I would be very grateful for guidance. How does one proceed in this 
>case? Should I simply create an arbitrary array or value to input into 
>the function that I want to test?
>I would be grateful for any guidance.
>
>-- 
>Sydney

You can mock your random number generator function to return something
that isn't random for the purposes of testing.  Mock is part of
unittest for Python 3.3 or later, but since you are on 2.7 you will
have to import mock as a separate library.

This nice blog post
http://fgimian.github.io/blog/2014/04/10/using-the-python-mock-library-to-fake-regular-functions-during-tests/

gives a whole slew of examples for how to use mock to mock the
random number generator function os.urandom .  But you can use it to mock
any function you like.

If you have trouble getting it to work, come back with code.  It is tricky
the first time you mock anything, to make sure you put the mock in
the correct place, but once you get the hang of this it is dirt simple.
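
As a rough illustration of the idea (not taken from the blog post), the
sketch below replaces random.gauss with a mock that returns fixed values.
initial_population is a hypothetical function standing in for the code
under test. It uses unittest.mock from Python 3.3 or later; on 2.7 the
separate mock library would be imported with "import mock" instead.

```python
import random
from unittest import mock


def initial_population(size, mean=15.0, sigma=3.75):
    """Hypothetical function under test: draws values from random.gauss."""
    return [random.gauss(mean, sigma) for _ in range(size)]


# While the patch is active, random.gauss hands back these values in order.
with mock.patch("random.gauss", side_effect=[14.0, 15.0, 16.0]) as fake_gauss:
    population = initial_population(3)

print(population)
print(fake_gauss.call_count)
```

The patch target is the name as the code under test looks it up. Here that
is random.gauss, because initial_population calls random.gauss through the
module. Code that did "from random import gauss" would need the name
patched in its own module instead, which is the usual first stumbling
block.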

Happy hacking,
Laura



Re: [Tutor] unittest with random population data

2015-05-30 Thread Peter Otten
Sydney Shall wrote:

> MAC OSX 10.10.3
> Enthought Python 2.7
> 
> I am an almost beginner.
> 
> Following advice from you generous people, I have chosen a project that
> interests me, to develop some knowledge of python.
> My project is a simulation of a biological population.
> I have a base class and a simulation function, which uses instances of
> the class.
> This, after many months of work and lots of advice, now seems to work
> well. It generates sensible data and when I write a small test program
> it gives sensible output.
> Now I want to learn to use unittest.
> I have written a unittest class which works OK.
> But the problem I have is that because I use the random module to
> populate my initial arrays, my data is not strictly predictable even
> though I am using seed(0). So the tests return many *fails* because the
> numbers are not exactly correct, although they are all rather close,
> consistent with the sigma value I have chosen for the spread of my
> population. I do of course use *almostEqual* and not *Equal*.
> So, I would be very grateful for guidance. How does one proceed in this
> case? Should I simply create an arbitrary array or value to input into
> the function that I want to test?
> I would be grateful for any guidance.

With the same input your program should produce exactly the same output.
Are you entering data into dicts? Try to set the

PYTHONHASHSEED 

environment variable to get reproducible results.
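
A small sketch of what the variable does, with the caveat that it only
matters when results depend on string hashing (for example the iteration
order of a set of strings). As far as I know, hash randomisation is on by
default from Python 3.3 and off by default on 2.7, so this may not apply
to the setup described above. The helper name is made up for the example.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed):
    """Return hash('population') as computed by a fresh interpreter."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    output = subprocess.check_output(
        [sys.executable, "-c", "print(hash('population'))"], env=env
    )
    return int(output)


# Two separate interpreter runs with the same seed agree on the hash.
same_seed = string_hash_in_child(1) == string_hash_in_child(1)
print(same_seed)
```

The variable has to be set before the interpreter starts, which is why the
sketch launches child processes rather than changing os.environ in place.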




[Tutor] unittest with random population data

2015-05-30 Thread Sydney Shall

MAC OSX 10.10.3
Enthought Python 2.7

I am an almost beginner.

Following advice from you generous people, I have chosen a project that 
interests me, to develop some knowledge of python.

My project is a simulation of a biological population.
I have a base class and a simulation function, which uses instances of 
the class.
This, after many months of work and lots of advice, now seems to work 
well. It generates sensible data and when I write a small test program 
it gives sensible output.

Now I want to learn to use unittest.
I have written a unittest class which works OK.
But the problem I have is that because I use the random module to 
populate my initial arrays, my data is not strictly predictable even 
though I am using seed(0). So the tests return many *fails* because the 
numbers are not exactly correct, although they are all rather close, 
consistent with the sigma value I have chosen for the spread of my 
population. I do of course use *almostEqual* and not *Equal*.
So, I would be very grateful for guidance. How does one proceed in this 
case? Should I simply create an arbitrary array or value to input into 
the function that I want to test?

I would be grateful for any guidance.

--
Sydney