[sage-devel] Re: Regression testing
+1 to the idea of timing doctests. I would use them all the time for regression testing and when testing improvements, not least for my own Sage "library". It seems to me that only very rarely would it be interesting to look at others' timing tests, so I don't really think the extra work of implementing and maintaining a solution that collects everyone's timings centrally is justifiable. However, +1 for the idea of looking for an existing open-source Python project, or contacting others about starting a new one.

> Also, I was talking to Craig Citro about this and he had the
> interesting idea of creating some kind of a "test object" which would
> be saved and then could be loaded into future versions of Sage and
> re-run. The idea of saving the tests that are run, and then running
> the exact same tests (rather than worrying about correlation of files
> and tests) will make catching regressions much easier.

Not to go off topic, but am I the only one bothered by the (at least theoretical) overhead of searching for, extracting and parsing every single doctest each time I want to run them? Does anyone know how much overhead is actually hidden there, timewise? If it is the least bit significant, we could extend the above excellent idea of test objects to be what is _always_ being run when doctesting, and then add a flag -extract_doctest to ./sage that rebuilds the test objects only when I know there has been a change.

Cheers,
Johan
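For illustration, a minimal sketch (not from the thread) of what pre-extracted "test objects" might look like, using the standard doctest module. The target module here is the stdlib's statistics, standing in for a sage.* module, and the pickle filename is made up; the timer gives a rough answer to the overhead question for one module:

    import doctest
    import pickle
    import time

    import statistics as target  # stand-in; in Sage this would be a sage.* module

    t0 = time.perf_counter()
    finder = doctest.DocTestFinder()
    # Extract (name, source, expected-output) triples once, so later runs
    # can skip the search-and-parse step entirely.
    tests = [(t.name, ex.source, ex.want)
             for t in finder.find(target)
             for ex in t.examples]
    elapsed = time.perf_counter() - t0
    print(f"extracted {len(tests)} examples in {elapsed:.3f}s")

    with open("test_objects.pickle", "wb") as f:
        pickle.dump(tests, f)

Re-running would then just be a matter of unpickling and executing, with no search or parse step.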
[sage-devel] Re: Regression testing
> One could modify local/bin/sage-doctest to allow the option of changing
> each doctest by wrapping it in a "timeit()" call. This would then
> generate a timing datum for each doctest line.

I did this, a long long time ago. Not clear whether it was ever merged. See:

http://trac.sagemath.org/sage_trac/ticket/3476

Nick
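As a rough illustration of the wrapping idea (not the actual patch on that ticket), each doctest line could be timed roughly like this; the statement here is a pure-Python stand-in for a real doctest line:

    import timeit

    stmt = "sum(i * i for i in range(10**4))"  # stand-in for one doctest line
    number = 25
    per_loop = timeit.timeit(stmt, number=number) / number
    print(f"{stmt}: {per_loop:.6f}s per loop")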
[sage-devel] Re: Regression testing
On Oct 25, 4:23 pm, Mitesh Patel wrote:
>> [...]
> On 10/25/2010 01:54 PM, William Stein wrote:
> > * A document with a unique id, starting at 0, for each actual test:
> >
> >     {'id': 0, 'code': 'factor(2^127+1)'}
> >
> > * A document for each result of running the tests on an actual
> >   platform:
> >
> >     {'machine': 'bsd.math.washington.edu', 'version': 'sage-4.5.3',
> >      'timings': {0: 1.3, 1: 0.5, ...}}
> >
> > Here, the timings are stored as a mapping from ids to floats.
>
> This last option seems most "natural" to me, though identical inputs
> that appear in multiple suites would generally(?) get different ids in
> the collections. Would it be better to use a hash of the 'code' for
> the 'id', or can the database automatically ensure that different ids
> imply different inputs?

http://www.mongodb.org/display/DOCS/Indexes#Indexes-UniqueIndexes
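A minimal pymongo sketch of that unique-index answer (assuming a MongoDB server on localhost; the database, collection and field names are made up for illustration):

    import hashlib

    from pymongo import ASCENDING, MongoClient

    client = MongoClient()       # assumes a local mongod
    db = client.doctest_timings  # illustrative database name

    # A unique index on the hash guarantees that two test documents can
    # never share the same input code, which answers the "different ids
    # imply different inputs" question at the database level.
    db.tests.create_index([("code_sha1", ASCENDING)], unique=True)

    code = "factor(2^127+1)"
    code_sha1 = hashlib.sha1(code.encode("utf-8")).hexdigest()
    db.tests.update_one(
        {"code_sha1": code_sha1},
        {"$setOnInsert": {"code": code}},
        upsert=True,  # insert once; later runs reuse the same document
    )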
[sage-devel] Re: Regression testing
On Oct 25, 2:47 pm, "Dr. David Kirkby" wrote:
> On 10/25/10 04:50 PM, Donald Alan Morrison wrote:
> > On Oct 25, 8:19 am, David Kirkby wrote:
> > > Getting a checksum of each doctest would be easy. I suggest we use:
> > >
> > >   $ cksum sometest.py | awk '{print $1}'
> > >
> > > because that will be totally portable across all platforms. 'cksum'
> > > is a 32-bit checksum that's part of the POSIX standard and the
> > > algorithm is defined. So there's no worry about whether one has an
> > > md5 program, and if so, what it's called.
> >
> > http://docs.python.org/library/hashlib.html#module-hashlib
> >
> > Python's standard library "hashlib" contains both MD5 and SHA1
> > message digests. Their advantage over a checksum (CRC) algorithm is
> > that the output digest changes dramatically when only one input bit
> > changes.
>
> I'm not convinced it's important how many bits change in the output if
> the input changes by one bit.

http://selenic.com/pipermail/mercurial/2009-April/025526.html
http://mercurial.selenic.com/wiki/Nodeid
Re: [sage-devel] Re: Regression testing
On 10/25/10 04:50 PM, Donald Alan Morrison wrote:
> On Oct 25, 8:19 am, David Kirkby wrote:
> > Getting a checksum of each doctest would be easy. I suggest we use:
> >
> >   $ cksum sometest.py | awk '{print $1}'
> >
> > because that will be totally portable across all platforms. 'cksum'
> > is a 32-bit checksum that's part of the POSIX standard and the
> > algorithm is defined. So there's no worry about whether one has an
> > md5 program, and if so, what it's called.
>
> http://docs.python.org/library/hashlib.html#module-hashlib
>
> Python's standard library "hashlib" contains both MD5 and SHA1 message
> digests. Their advantage over a checksum (CRC) algorithm is that the
> output digest changes dramatically when only one input bit changes.

I'm not convinced it's important how many bits change in the output if the input changes by one bit. But if Python has it, then by all means use that.

I was thinking it would be fairly trivial to process the ptestlong.log file to get a set of times and checksums; I think I could do it in a 20-30 line shell script. But we really need CPU time for the doctests to make this useful.

IMHO, it would be good if the output from the doctests showed real time, CPU time, and the actual time/date. Then we could correlate failures with system logs, to see whether we have run out of swap space or similar.

dave
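Using hashlib for the per-doctest checksum could look like the following sketch (the helper name is made up, not an existing Sage function):

    import hashlib

    def doctest_checksum(source: str) -> str:
        # Normalize line endings so the digest is identical on all platforms.
        data = source.replace("\r\n", "\n").encode("utf-8")
        return hashlib.sha1(data).hexdigest()

    print(doctest_checksum("factor(2^127+1)"))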
[sage-devel] Re: Regression testing
On Oct 25, 8:19 am, David Kirkby wrote:
> Getting a checksum of each doctest would be easy. I suggest we use:
>
>   $ cksum sometest.py | awk '{print $1}'
>
> because that will be totally portable across all platforms. 'cksum' is
> a 32-bit checksum that's part of the POSIX standard and the algorithm
> is defined. So there's no worry about whether one has an md5 program,
> and if so, what it's called.

http://docs.python.org/library/hashlib.html#module-hashlib

Python's standard library "hashlib" contains both MD5 and SHA1 message digests. Their advantage over a checksum (CRC) algorithm is that the output digest changes dramatically when only one input bit changes.
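A quick demo of that avalanche property (my addition, not part of the original post), comparing CRC-32 and SHA1 on two inputs that differ by a single bit:

    import hashlib
    import zlib

    a = b"factor(2^127+1)"
    b = b"factor(2^127+3)"  # '1' (0x31) vs '3' (0x33): a single-bit difference

    # The two SHA1 digests differ throughout, not just in one position.
    print(f"crc32: {zlib.crc32(a):08x}  {zlib.crc32(b):08x}")
    print(f"sha1:  {hashlib.sha1(a).hexdigest()}")
    print(f"       {hashlib.sha1(b).hexdigest()}")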
[sage-devel] Re: Regression testing
This would be a good addition to the Sage development process indeed. Before going through all the work of implementing this stuff, it might be wise to first look at what's already out there. It might prevent you from doing double work, or give inspiration on how to do it. There seems to be a package which already does some performance testing, in the form of pyUnitPerf: http://sourceforge.net/projects/pyunitperf/ . There is also a coding recipe that tries to make timing results comparable across machines by comparing test times against a performance test of the machine itself: http://code.activestate.com/recipes/440700-performance-testing-with-a-pystone-measurement-dec/

Since we're probably not the only ones wanting this sort of testing facility, it might be interesting to collaborate on this with programmers of other open-source Python projects and make a standalone project of it. That way we wouldn't have to maintain it all on our own, and it would be entirely in the open-source spirit.

- Maarten Derickx

On Oct 21, 6:55 am, Robert Bradshaw wrote:
> On Wed, Oct 20, 2010 at 5:33 PM, David Roe wrote:
> > There are a number of tickets in trac about performance regressions
> > in Sage. I'm sure there are far more performance regressions which we
> > don't know about because nobody noticed.
> >
> > As someone writing library code, it's generally not obvious that one
> > is about to introduce a performance regression (otherwise you'd
> > probably not do it). There have been instances where I've wondered
> > whether a change I was making might have unintended consequences for
> > performance, but testing all the ways in which this could manifest is
> > laborious.
> >
> > Consequently, I've been thinking recently about how to detect
> > performance regressions automatically. There are really two parts to
> > the problem: gathering timing data on the Sage library, and analyzing
> > that data to determine if regressions have occurred (and how serious
> > they are).
>
> +1. This has been lacking for a long time.
>
> > Data gathering:
> >
> > One could modify local/bin/sage-doctest to allow the option of
> > changing each doctest by wrapping it in a "timeit()" call. This would
> > then generate a timing datum for each doctest line.
> >
> > * These timings vary from run to run (presumably due to differing
> >   levels of load on the machine). I don't know how to account for
> >   this, but usually it's a fairly small effect (on the order of 10%
> >   error).
> > * If you're testing against a previous version of Sage, the doctest
> >   structure will have changed because people wrote more doctests. And
> >   doctest lines depend on each other: you define variables that are
> >   used in later lines. So inserting a line could make timings of
> >   later lines incomparable to the exact same line without the
> >   inserted line. We might be able to parse the lines and check that
> >   various objects are actually the same (across different versions of
> >   Sage, so this would require either a version-invariant hash, or
> >   saving in one version, loading in the other, and comparing; and you
> >   would have to do that for each variable that occurs in the line),
> >   but that seems to be getting too complicated...
> >
> > Many users will only have one version of Sage installed. And even
> > with multiple versions, you need somewhere to PUT the data in order
> > to compare to later.
> >
> > One way of handling these problems is to create a relational database
> > to put timings in. This could also be interesting for benchmarketing
> > purposes: we could have timings on various machines, and we could
> > highlight performance improvements, in addition to watching for
> > performance regressions.
> >
> > So, here's a first draft of a database schema to put timing data
> > into. I've included a description of each table, along with a
> > description of the columns I thought were non-obvious. I'm definitely
> > interested in suggestions for improving this schema.
> >
> > Table: Hosts
> > # computer information, including identifying data to determine when
> > # running on the same host
> > col: id
> > col: identifying_data # some way to uniquely identify the computer on
> >   which a test is being run. Presumably the output of some unix
> >   function, but I don't know what.
> >
> > Table: Sage_version
> > # a table giving each existing version of Sage an id
> > # the ids for official Sage releases should be consistent across
> > # databases; users can also create their own temporary versions which
> > # use a different block of id numbers.
> > # when a new version is created, the Files, Doctest_blocks and
> > # Doctest_lines tables are updated
> > col: id
> > col: version_name # string defining the version
> > col: prev_version_id # versions should be totally ordered. Since we
> >   want to reserve some id space for official Sage versions and other
> >   space for users, this can't be done by just the numerical id.
> >
> > Table: Files
> > # a table for all the files in Sage that have doctests
> [...]
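A minimal sketch of how the start of that draft schema might look in SQLite (the hosts, sage_version and files tables follow the column descriptions above; the timings table at the end is my assumption about where the per-test data would land, since that part of the draft is cut off):

    import sqlite3

    conn = sqlite3.connect("doctest_timings.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS hosts (
        id INTEGER PRIMARY KEY,
        identifying_data TEXT UNIQUE  -- e.g. hostname plus CPU info; the
                                      -- draft leaves the exact source open
    );

    CREATE TABLE IF NOT EXISTS sage_version (
        id INTEGER PRIMARY KEY,
        version_name TEXT NOT NULL,
        prev_version_id INTEGER REFERENCES sage_version(id)
    );

    CREATE TABLE IF NOT EXISTS files (
        id INTEGER PRIMARY KEY,
        version_id INTEGER REFERENCES sage_version(id),
        path TEXT NOT NULL
    );

    -- Not in the draft: one plausible shape for the per-test results.
    CREATE TABLE IF NOT EXISTS timings (
        host_id INTEGER REFERENCES hosts(id),
        version_id INTEGER REFERENCES sage_version(id),
        test_sha1 TEXT NOT NULL,      -- checksum of the doctest source
        cpu_seconds REAL
    );
    """)
    conn.commit()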