Re: [Tutor] unittest with random population data
[ I've taken this discussion back to the tutor list. - Cameron ]

On 01Jun2015 18:14, Sydney Shall wrote:
> On 31/05/2015 03:00, Cameron Simpson wrote:
>> You say that your results are "all rather close, consistent with the sigma value I have chosen for the spread of my population". I would advocate making some "constraint" tests that verify this property for _any_ input data set. Then you can run with random and _changing_ input data sets to verify that your code produces the expected _kind_ of results with many data sets.
>> [...]
>> range class which stores contiguous ranges efficiently (a sequence of (low,high) pairs). It has a few add/remove operations which are meant to maintain that sequence in ordered minimal form. Cutting and merging adjacent ranges is very easy to get wrong, very sensitive to off-by-one logic errors. So my tests for this class include some random tests which do random unpredictable add/remove operations, and run a consistency check on the object after each operation. This gives me good odds of exercising some tricky sequence which I have not considered explicitly myself.
>>
>> You can see the test suite here:
>> https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py
>
> I have studied the material you indicated and it has been most helpful. I think that I have understood the principle involved, but I have had some problem implementing it. The range tests are mostly clear to me but there is one aspect I cannot follow. You use in this suite imports from Range, including Range, overlap, spans and Span. Are these programs that you have written? If so, are they specific to your set up or are they generic? If so, is it possible to obtain these programs?

The "cs.range" is a module of my own. As you might imagine, cs.range_tests has the entire purpose of testing cs.range. cs.range is here:

https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range.py

Feel free. Since it in turn has some dependencies, the easiest way to get it is to use "pip" to install it, as pip will also fetch and install its dependencies. But if you just want something to read, fetch and enjoy.

> I have established a very primitive test suite based on your cs.range notions and it works fine, but it would be better, I am sure, to do it properly.

There are many other people whose test writing ability surpasses mine. But you're welcome to learn anything you can from mine, and to ask questions.

> Finally, I have one comment for the respected Moderator, if he is not out on a walk in the highlands in this cold and wet weather. I have taken the liberty of raising my problem here rather than elsewhere, because I have observed that the biological and bio-medical community, who always come late to new notions, is now rapidly discovering python. A great deal of work in these fields involves either stochastic simulations or statistical problems of analysis. The latter are more or less straight-forward, but the simulations are not.

You might ask a separate question on the python-l...@python.org about simulations. It has a wider audience than the tutor list and may well include people doing simulation work, or who know where to look.

> Thanks for all the help. You people are a model of how we could perhaps civilize humanity.

Nah. We might all be striving to be a model of how humanity might be when civilised though...

Cheers,
Cameron Simpson

You my man are a danger to society and should be taken out of society for all our sakes. As to what is done to you once removed I couldn't care less. - Roy G. Culley, Unix Systems Administrator

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
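The random consistency testing Cameron describes for his range class can be sketched roughly as follows. This is not the code from cs.range or cs.range_tests: `SpanSet`, `add`, `check` and `random_equivalence_run` are hypothetical names for a deliberately tiny stand-in, and a plain Python set plays the role of the trusted reference model.

```python
import random


class SpanSet:
    """Hypothetical stand-in: integers stored as sorted, disjoint,
    non-adjacent half-open (low, high) spans."""

    def __init__(self):
        self.spans = []

    def add(self, low, high):
        """Add the integers low..high-1, merging any spans that touch."""
        if low >= high:
            return
        merged = []
        for s_low, s_high in self.spans:
            if s_high < low or s_low > high:
                merged.append((s_low, s_high))   # neither overlapping nor adjacent
            else:
                low = min(low, s_low)            # absorb the touching span
                high = max(high, s_high)
        merged.append((low, high))
        merged.sort()
        self.spans = merged

    def __iter__(self):
        for low, high in self.spans:
            for value in range(low, high):
                yield value

    def check(self):
        """Consistency check: spans are non-empty, ordered, and separated by a gap."""
        previous_high = None
        for low, high in self.spans:
            assert low < high
            assert previous_high is None or previous_high < low
            previous_high = high


def random_equivalence_run(seed, operations=200):
    """Apply random add operations; after each one, check the invariants
    and compare the contents against a plain set."""
    rng = random.Random(seed)
    spans, model = SpanSet(), set()
    for _ in range(operations):
        low = rng.randrange(0, 100)
        high = low + rng.randrange(0, 10)
        spans.add(low, high)
        model.update(range(low, high))
        spans.check()
        assert set(spans) == model
    return spans
```

Passing the seed in explicitly means a failing run can be replayed exactly, which is one way to combine random inputs with reproducible failures.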
Re: [Tutor] unittest with random population data
On 01/06/15 18:14, Sydney Shall wrote:
> Finally, I have one comment for the respected Moderator, if he is not out on a walk in the highlands in this cold and wet weather.

No, that's on tomorrow's schedule :-)

> I have taken the liberty of raising my problem here rather than elsewhere, because I have observed that the biological and bio-medical community, who always come late to new notions, is now rapidly discovering python. A great deal of work in these fields involves either stochastic simulations or statistical problems of analysis. The latter are more or less straight-forward, but the simulations are not.

I have no problem with this kind of material on tutor because, although the list is for "beginners to Python", that includes those with much experience in other languages and who may expect to use TDD from the beginning. So TDD techniques using standard library modules are a valid topic even though they may be too advanced for those beginners who are new to programming. We need to balance coverage to cater for both categories of "beginner".

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
Re: [Tutor] unittest with random population data
On 31/05/2015 03:00, Cameron Simpson wrote:
> On 30May2015 12:16, Sydney Shall wrote:
>> Following advice from you generous people, I have chosen a project that interests me, to develop some knowledge of python. My project is a simulation of a biological population. I have a base class and a simulation function, which uses instances of the class. This, after many months of work and lots of advice, now seems to work well. It generates sensible data and when I write a small test program it gives sensible output. Now I want to learn to use unittest. I have written a unittest class which works OK. But the problem I have is that because I use the random module to populate my initial arrays, my data is not strictly predictable even though I am using seed(0). So the tests return many *fails* because the numbers are not exactly correct, although they are all rather close, consistent with the sigma value I have chosen for the spread of my population. I do of course use *almostEqual* and not *Equal*.
>
> First of all, several people have posted suggestions for getting identical results on every run.
>
> However, there is another approach, which you might consider. (And use in addition, not instead of, the reproducible suggestions).
>
> It is all very well to have a unit test that runs exactly the same with a test set of data - it lets you have confidence that algorithm changes do not change the outcome. But only for that data set.
>
> You say that your results are "all rather close, consistent with the sigma value I have chosen for the spread of my population". I would advocate making some "constraint" tests that verify this property for _any_ input data set. Then you can run with random and _changing_ input data sets to verify that your code produces the expected _kind_ of results with many data sets.
>
> So you would have one test which ran with a fixed data set which confirms predictable unchanging results. And you have other tests which run with randomly chosen data and confirm that outcomes fall within the parameters you expect. You can apply those checks ("outcome in range") to both sets of tests.
>
> As an example, I have a few classes which maintain data structures which are sensitive to boundary conditions. The glaring example is a numeric range class which stores contiguous ranges efficiently (a sequence of (low,high) pairs). It has a few add/remove operations which are meant to maintain that sequence in ordered minimal form. Cutting and merging adjacent ranges is very easy to get wrong, very sensitive to off-by-one logic errors.
>
> So my tests for this class include some random tests which do random unpredictable add/remove operations, and run a consistency check on the object after each operation. This gives me good odds of exercising some tricky sequence which I have not considered explicitly myself.
>
> You can see the test suite here:
> https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py
>
> It has a bunch of explicit specific tests up the top, and then the random consistency test down the bottom as "test30random_set_equivalence".
>
> Cheers,
> Cameron Simpson
>
> MS-Word is Not a document exchange format - Jeff Goldberg
> http://www.goldmark.org/netrants/no-word/attach.html

Cameron,

Thanks for your most helpful reply. I have studied the material you indicated and it has been most helpful. I think that I have understood the principle involved, but I have had some problem implementing it.

The range tests are mostly clear to me but there is one aspect I cannot follow. You use in this suite imports from Range, including Range, overlap, spans and Span. Are these programs that you have written? If so, are they specific to your set up or are they generic? If so, is it possible to obtain these programs?

I have established a very primitive test suite based on your cs.range notions and it works fine, but it would be better, I am sure, to do it properly.

Finally, I have one comment for the respected Moderator, if he is not out on a walk in the highlands in this cold and wet weather. I have taken the liberty of raising my problem here rather than elsewhere, because I have observed that the biological and bio-medical community, who always come late to new notions, is now rapidly discovering python. A great deal of work in these fields involves either stochastic simulations or statistical problems of analysis. The latter are more or less straight-forward, but the simulations are not.

Thanks for all the help. You people are a model of how we could perhaps civilize humanity.

--
Sydney
Re: [Tutor] unittest with random population data
On 02/06/2015 07:59, Steven D'Aprano wrote:
> Please keep your replies on the tutor list, so that others may offer advice, and learn from your questions.
>
> Thanks,
> Steve
>
> On Mon, Jun 01, 2015 at 06:03:08PM +0100, Sydney Shall wrote:
>> On 31/05/2015 00:41, Steven D'Aprano wrote:
>>> On Sat, May 30, 2015 at 12:16:01PM +0100, Sydney Shall wrote:
>>>> I have written a unittest class which works OK. But the problem I have is that because I use the random module to populate my initial arrays, my data is not strictly predictable even though I am using seed(0).
>>>
>>> Please show us how you populate your arrays, because what you describe sounds wrong. Seeding to the same value should give the same sequence of values:
>>>
>>> py> import random
>>> py> random.seed(0)
>>> py> a = [random.random() for i in range(10**6)]
>>> py> random.seed(0)
>>> py> b = [random.random() for i in range(10**6)]
>>> py> a == b
>>> True
>>
>> Thank you for the advice Steven. I was of course aware that I had to use random.seed(0), which I had done. I was puzzled by the fact that it did not give me reproducible results, which it did when I was learning python. But because you drew attention to the problem, I have looked at it again. I surmised that perhaps I had put the seed statement in the wrong place. I have tried several places, but I always get the same spread of results.
>>
>> Perhaps, to help get advice, I should explain that I populate a list thus:
>>
>> self.ucc = np.random.normal(self.mean_ucc, self.sigma_ucc, 200)
>>
>> This does give me a list of 200 slightly different numbers. The mean_ucc is always 15.0 and the sigma value is always 3.75. The actual mean and sigma of the random numbers are checked to be within 5.0% of 15.0 and 3.75 respectively.
>>
>> Following your advice I did a little test. I repeated a little test program that I have written, which gives me sensible and proper results. I repeated the test 8 times and then noted a useful output result. When I plot the actual mean of the population used against the output result I chose, I obtain a perfect straight line, which I should.
>>
>> Now I still think that I am using the random.seed(0) either incorrectly or in the wrong place. If there is any other information that might clarify my problem, I will be grateful to be told. I would be most grateful for any guidance you may have, indicating where I should look for what I suspect is a beginner's error.
>>
>> --
>> Sydney

Apologies. I thought that I had done so. I will be more careful in future.

--
Sydney
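One possible explanation, not confirmed anywhere in this thread: NumPy's np.random functions keep their own generator state, separate from the standard library random module, so calling random.seed(0) would have no effect on np.random.normal(...). The sketch below uses only the standard library to illustrate the general point that a seed only affects the generator it is applied to; `make_population` is a hypothetical helper, and the two `random.Random` objects merely stand in for "random" and "np.random". With NumPy itself, the analogous step would be np.random.seed(0), or drawing from a dedicated np.random.RandomState(0) instance.

```python
import random


def make_population(rng, mean=15.0, sigma=3.75, size=200):
    """Hypothetical helper: draw `size` normally distributed values from `rng`."""
    return [rng.gauss(mean, sigma) for _ in range(size)]


# Two separate generator objects, standing in for "random" and "np.random".
seeded = random.Random(0)
unrelated = random.Random()      # seeded from the OS; never reseeded below

first = make_population(seeded)
seeded.seed(0)                   # reseeding the SAME generator replays its sequence
second = make_population(seeded)

other_a = make_population(unrelated)
seeded.seed(0)                   # seeding a DIFFERENT generator changes nothing here
other_b = make_population(unrelated)

print(first == second)           # the reseeded generator repeats its draws
print(other_a == other_b)        # the other generator just keeps going
```

If this is indeed the cause, the "same spread of results" regardless of where random.seed(0) is placed would be expected, because that call never touches the generator that produces the population.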
Re: [Tutor] unittest with random population data
On 30May2015 12:16, Sydney Shall wrote:
> Following advice from you generous people, I have chosen a project that interests me, to develop some knowledge of python. My project is a simulation of a biological population. I have a base class and a simulation function, which uses instances of the class. This, after many months of work and lots of advice, now seems to work well. It generates sensible data and when I write a small test program it gives sensible output. Now I want to learn to use unittest. I have written a unittest class which works OK. But the problem I have is that because I use the random module to populate my initial arrays, my data is not strictly predictable even though I am using seed(0). So the tests return many *fails* because the numbers are not exactly correct, although they are all rather close, consistent with the sigma value I have chosen for the spread of my population. I do of course use *almostEqual* and not *Equal*.

First of all, several people have posted suggestions for getting identical results on every run.

However, there is another approach, which you might consider. (And use in addition, not instead of, the reproducible suggestions).

It is all very well to have a unit test that runs exactly the same with a test set of data - it lets you have confidence that algorithm changes do not change the outcome. But only for that data set.

You say that your results are "all rather close, consistent with the sigma value I have chosen for the spread of my population". I would advocate making some "constraint" tests that verify this property for _any_ input data set. Then you can run with random and _changing_ input data sets to verify that your code produces the expected _kind_ of results with many data sets.

So you would have one test which ran with a fixed data set which confirms predictable unchanging results. And you have other tests which run with randomly chosen data and confirm that outcomes fall within the parameters you expect. You can apply those checks ("outcome in range") to both sets of tests.

As an example, I have a few classes which maintain data structures which are sensitive to boundary conditions. The glaring example is a numeric range class which stores contiguous ranges efficiently (a sequence of (low,high) pairs). It has a few add/remove operations which are meant to maintain that sequence in ordered minimal form. Cutting and merging adjacent ranges is very easy to get wrong, very sensitive to off-by-one logic errors.

So my tests for this class include some random tests which do random unpredictable add/remove operations, and run a consistency check on the object after each operation. This gives me good odds of exercising some tricky sequence which I have not considered explicitly myself.

You can see the test suite here:
https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/range_tests.py

It has a bunch of explicit specific tests up the top, and then the random consistency test down the bottom as "test30random_set_equivalence".

Cheers,
Cameron Simpson

MS-Word is Not a document exchange format - Jeff Goldberg
http://www.goldmark.org/netrants/no-word/attach.html
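The two kinds of test described in this message (one fixed data set, plus "constraint" checks on changing random data) might look roughly like the sketch below. It is only an illustration under stated assumptions: `make_population` is a hypothetical stand-in for the simulation code, the mean of 15.0, sigma of 3.75 and size of 200 are taken from elsewhere in the thread, and the tolerance of six standard errors is my own choice, not something the thread prescribes.

```python
import math
import random
import unittest


def make_population(rng, mean=15.0, sigma=3.75, size=200):
    """Hypothetical stand-in for the code under test."""
    return [rng.gauss(mean, sigma) for _ in range(size)]


def sample_mean(values):
    return sum(values) / len(values)


class PopulationTests(unittest.TestCase):
    MEAN, SIGMA, SIZE = 15.0, 3.75, 200

    def check_constraints(self, values):
        """Properties that should hold for any data set, whatever the seed."""
        self.assertEqual(len(values), self.SIZE)
        # Tolerance of six standard errors of the mean: loose enough that a
        # correct generator should essentially never fail it by chance.
        tolerance = 6 * self.SIGMA / math.sqrt(self.SIZE)
        self.assertLess(abs(sample_mean(values) - self.MEAN), tolerance)

    def test_fixed_seed_is_repeatable(self):
        # Fixed data set: identical seed, identical results, every run.
        first = make_population(random.Random(0))
        second = make_population(random.Random(0))
        self.assertEqual(first, second)
        self.check_constraints(first)

    def test_random_seeds_satisfy_constraints(self):
        # Changing data sets: only the constraints are checked.
        for _ in range(20):
            seed = random.randrange(2**32)
            with self.subTest(seed=seed):   # report the seed on failure
                self.check_constraints(make_population(random.Random(seed)))
```

The tests can be run with `python -m unittest`. The tolerance matters: for 200 normal samples the standard error of the sample standard deviation is roughly sigma/sqrt(2*200), about 5% of sigma, so a 5% check on sigma would be expected to fail by chance fairly often even when the code is correct.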
Re: [Tutor] unittest with random population data
On Sat, May 30, 2015 at 12:16:01PM +0100, Sydney Shall wrote:

> I have written a unittest class which works OK.
> But the problem I have is that because I use the random module to
> populate my initial arrays, my data is not strictly predictable even
> though I am using seed(0).

Please show us how you populate your arrays, because what you describe sounds wrong.

Seeding to the same value should give the same sequence of values:

py> import random
py> random.seed(0)
py> a = [random.random() for i in range(10**6)]
py> random.seed(0)
py> b = [random.random() for i in range(10**6)]
py> a == b
True

--
Steve
Re: [Tutor] unittest with random population data
In a message of Sat, 30 May 2015 12:16:01 +0100, Sydney Shall writes:
>MAC OSX 10.10.3
>Enthought Python 2.7
>
>I am an almost beginner.
>
>Following advice from you generous people, I have chosen a project that
>interests me, to develop some knowledge of python.
>My project is a simulation of a biological population.
>I have a base class and a simulation function, which uses instances of
>the class.
>This, after many months of work and lots of advice, now seems to work
>well. It generates sensible data and when I write a small test program
>it gives sensible output.
>Now I want to learn to use unittest.
>I have written a unittest class which works OK.
>But the problem I have is that because I use the random module to
>populate my initial arrays, my data is not strictly predictable even
>though I am using seed(0). So the tests return many *fails* because the
>numbers are not exactly correct, although they are all rather close,
>consistent with the sigma value I have chosen for the spread of my
>population. I do of course use *almostEqual* and not *Equal*.
>So, I would be very grateful for guidance. How does one proceed in this
>case? Should I simply create an arbitrary array or value to input into
>the function that I want to test?
>I would be grateful for any guidance.
>
>--
>Sydney

You can mock your random number generator function to return something that isn't random for the purposes of testing. Mock is part of unittest for Python 3.3 or later, but since you are on 2.7 you will have to import mock as a separate library.

This nice blog post
http://fgimian.github.io/blog/2014/04/10/using-the-python-mock-library-to-fake-regular-functions-during-tests/
gives a whole slew of examples for how to use mock to mock the random number generator function os.urandom. But you can use it to mock any function you like.

If you have trouble getting it to work, come back with code. It is tricky the first time you mock anything, to make sure you put the mock in the correct place, but once you get the hang of this it is dirt simple.

Happy hacking,
Laura
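As a rough illustration of the mocking approach Laura describes, the sketch below patches random.gauss so that it returns a fixed value during one test. `make_population` is a hypothetical function standing in for the code under test, and the example is written for Python 3, where mock lives in unittest; on Python 2.7 the separately installed `mock` package would be imported instead, as the message says.

```python
import random
import unittest
from unittest import mock


def make_population(size, mean=15.0, sigma=3.75):
    """Hypothetical code under test; looks up random.gauss at call time."""
    return [random.gauss(mean, sigma) for _ in range(size)]


class MockedRandomTests(unittest.TestCase):
    def test_population_uses_generator_output(self):
        # Replace random.gauss for the duration of the with-block only.
        with mock.patch("random.gauss", return_value=15.0) as fake_gauss:
            population = make_population(5)
        self.assertEqual(population, [15.0] * 5)
        self.assertEqual(fake_gauss.call_count, 5)
        fake_gauss.assert_called_with(15.0, 3.75)
```

The "correct place" point is worth noting: the patch target must be the name the code under test actually looks up. Here that is the `gauss` attribute of the `random` module; code that did `from random import gauss` would need the name patched in its own module instead.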
Re: [Tutor] unittest with random population data
Sydney Shall wrote:

> MAC OSX 10.10.3
> Enthought Python 2.7
>
> I am an almost beginner.
>
> Following advice from you generous people, I have chosen a project that
> interests me, to develop some knowledge of python.
> My project is a simulation of a biological population.
> I have a base class and a simulation function, which uses instances of
> the class.
> This, after many months of work and lots of advice, now seems to work
> well. It generates sensible data and when I write a small test program
> it gives sensible output.
> Now I want to learn to use unittest.
> I have written a unittest class which works OK.
> But the problem I have is that because I use the random module to
> populate my initial arrays, my data is not strictly predictable even
> though I am using seed(0). So the tests return many *fails* because the
> numbers are not exactly correct, although they are all rather close,
> consistent with the sigma value I have chosen for the spread of my
> population. I do of course use *almostEqual* and not *Equal*.
> So, I would be very grateful for guidance. How does one proceed in this
> case? Should I simply create an arbitrary array or value to input into
> the function that I want to test?
> I would be grateful for any guidance.

With the same input your program should produce exactly the same output. Are you entering data into dicts? Try to set the PYTHONHASHSEED environment variable to get reproducible results.
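To see what the PYTHONHASHSEED suggestion does, the sketch below starts fresh interpreters with the variable set and compares the hash of a string between runs. `hash_in_child` is a hypothetical helper written for this illustration. Note that hash randomization is enabled by default only from Python 3.3 onwards; on Python 2.7 it is off unless requested, so this may not be the cause of the original problem.

```python
import os
import subprocess
import sys


def hash_in_child(hash_seed):
    """Run a fresh interpreter with PYTHONHASHSEED set; return hash('population')."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    output = subprocess.check_output(
        [sys.executable, "-c", "print(hash('population'))"], env=env)
    return int(output.decode().strip())


# Same hash seed in two separate processes: the string hashes agree.
print(hash_in_child(0) == hash_in_child(0))
```

The variable has to be set before the interpreter starts, for example `PYTHONHASHSEED=0 python -m unittest` in a shell; changing os.environ inside an already running program does not affect that program's own hashes.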
[Tutor] unittest with random population data
MAC OSX 10.10.3
Enthought Python 2.7

I am an almost beginner.

Following advice from you generous people, I have chosen a project that interests me, to develop some knowledge of python. My project is a simulation of a biological population. I have a base class and a simulation function, which uses instances of the class. This, after many months of work and lots of advice, now seems to work well. It generates sensible data and when I write a small test program it gives sensible output.

Now I want to learn to use unittest. I have written a unittest class which works OK. But the problem I have is that because I use the random module to populate my initial arrays, my data is not strictly predictable even though I am using seed(0). So the tests return many *fails* because the numbers are not exactly correct, although they are all rather close, consistent with the sigma value I have chosen for the spread of my population. I do of course use *almostEqual* and not *Equal*.

So, I would be very grateful for guidance. How does one proceed in this case? Should I simply create an arbitrary array or value to input into the function that I want to test? I would be grateful for any guidance.

--
Sydney