[sage-devel] Re: Regression testing

2010-10-26 Thread Johan S. R. Nielsen
+1 to the idea of timing doctests. I would use such timings all the
time for regression testing and when testing improvements; not least
for my own Sage "library".

It seems to me that it would only very rarely be interesting to look
at others' timing tests, so I don't really think the extra work
required to implement and maintain a solution that collects everyone's
timings centrally is justifiable. However, +1 for the idea of looking
for an existing open-source Python project, or contacting others about
starting a new one.

> Also, I was talking to Craig Citro about this and he had the
> interesting idea of creating some kind of a "test object" which would
> be saved and then could be loaded into future versions of Sage and
> re-run. The idea of saving the tests that are run, and then running
> the exact same tests (rather than worrying about correlation of files
> and tests) will make catching regressions much easier.

Not to go off topic, but am I the only one bothered by the - at least
theoretical - overhead of searching for, extracting and parsing every
single doctest each time I want to run them? Does anyone know how much
overhead is actually hidden there (time-wise)? If it is the least bit
significant, we could extend the excellent test-object idea above so
that the test objects are what is _always_ run when doctesting, and
then add a flag like -extract_doctest to ./sage for rebuilding the
test objects only whenever I know there has been a change.
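
For illustration, a rough sketch of the caching idea, using only the
standard doctest and pickle modules (the function names and the cache
format here are just made up):

    import doctest
    import pickle

    def extract_doctests(module, cache_file):
        # The expensive step: search the module and parse every docstring.
        tests = doctest.DocTestFinder().find(module)
        data = [(t.name, [(ex.source, ex.want) for ex in t.examples])
                for t in tests]
        with open(cache_file, 'wb') as f:
            pickle.dump(data, f)

    def load_doctests(cache_file):
        # The cheap step: later runs just unpickle the cached tests
        # instead of re-scanning the sources.
        with open(cache_file, 'rb') as f:
            return pickle.load(f)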

Cheers,
Johan



[sage-devel] Re: Regression testing

2010-10-25 Thread Nick Alexander
> One could modify local/bin/sage-doctest to allow the option of changing each
> doctest by wrapping it in a "timeit()" call.  This would then generate a
> timing datum for each doctest line.

I did this, a long long time ago.  Not clear whether it was ever
merged.  See:

http://trac.sagemath.org/sage_trac/ticket/3476
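
The core of the idea is tiny; roughly this sketch in current Python
(not the actual patch from the ticket):

    import timeit

    def time_doctest_line(source, globs, number=25):
        # Run the doctest statement `number` times in the doctest's own
        # namespace and return the best per-call time in seconds.
        timer = timeit.Timer(stmt=source, globals=globs)
        return min(timer.repeat(repeat=3, number=number)) / number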

Nick



[sage-devel] Re: Regression testing

2010-10-25 Thread Donald Alan Morrison


On Oct 25, 4:23 pm, Mitesh Patel  wrote:
>>[...]
> On 10/25/2010 01:54 PM, William Stein wrote:
> >   * A document with a unique id, starting at 0, for each actual test
> >        {'id':0, 'code':'factor(2^127+1)'}
>
> >   * A document for each result of running the tests on an actual platform:
> >        {'machine':'bsd.math.washington.edu', 'version':'sage-4.5.3',
> > 'timings':{0:1.3, 1:0.5,...} }
> > Here, the timings are stored as a mapping from id's to floats.
>
> This last option seems most "natural" to me, though identical inputs
> that appear in multiple suites would generally(?) get different ids in
> the collections.  Would it be better to use a hash of the 'code' for the
> 'id', or can the database automatically ensure that different ids imply
> different inputs?

http://www.mongodb.org/display/DOCS/Indexes#Indexes-UniqueIndexes
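
E.g., with a hash of the input as the document _id, uniqueness comes
for free, since _id always has a unique index (a pymongo sketch; the
database and collection names are made up):

    import hashlib
    from pymongo import MongoClient

    code = 'factor(2^127+1)'
    # Same code => same _id, across suites and across databases.
    doc_id = hashlib.sha1(code.encode()).hexdigest()

    tests = MongoClient().timings.tests
    # Insert once; a repeat of the same code is a no-op.
    tests.update_one({'_id': doc_id},
                     {'$setOnInsert': {'code': code}},
                     upsert=True)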



[sage-devel] Re: Regression testing

2010-10-25 Thread Donald Alan Morrison


On Oct 25, 2:47 pm, "Dr. David Kirkby"  wrote:
> On 10/25/10 04:50 PM, Donald Alan Morrison wrote:
> > On Oct 25, 8:19 am, David Kirkby  wrote:
> >> Getting a checksum of each doctest would be easy. I suggest we use:
> >> $ cksum sometest.py  | awk '{print $1}'
> >> because that will be totally portable across all platforms. 'cksum' is
> >> a 32-bit checksum that's part of the POSIX standard, and the algorithm
> >> is defined. So there's no worry about whether one has an md5 program,
> >> and if so, what it's called.
> >http://docs.python.org/library/hashlib.html#module-hashlib
>
> > Python's standard library "hashlib" contains both MD5 and SHA1 Message
> > Digests.
>
> > Their advantage over a checksum (CRC) algorithm is that the output
> > digest changes dramatically when only one input bit changes.
>
> I'm not convinced it's important how many bits change in the output if
> the input changes by one bit.

http://selenic.com/pipermail/mercurial/2009-April/025526.html
http://mercurial.selenic.com/wiki/Nodeid



Re: [sage-devel] Re: Regression testing

2010-10-25 Thread Dr. David Kirkby

On 10/25/10 04:50 PM, Donald Alan Morrison wrote:
> On Oct 25, 8:19 am, David Kirkby  wrote:
>> Getting a checksum of each doctest would be easy. I suggest we use:
>>
>> $ cksum sometest.py  | awk '{print $1}'
>>
>> because that will be totally portable across all platforms. 'cksum' is
>> a 32-bit checksum that's part of the POSIX standard, and the algorithm
>> is defined. So there's no worry about whether one has an md5 program,
>> and if so, what it's called.
>
> http://docs.python.org/library/hashlib.html#module-hashlib
>
> Python's standard library "hashlib" contains both MD5 and SHA1 Message
> Digests.
>
> Their advantage over a checksum (CRC) algorithm is that the output
> digest changes dramatically when only one input bit changes.



I'm not convinced it's important how many bits change in the output if the input 
changes by one bit.


But if Python has it, then by all means use that.

I was thinking it would be fairly trivial to process the ptestlong.log file to 
get a set of times and checksums. I think I could do it in a 20-30 line shell 
script.
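
In Python the scraping might look something like this (I'm guessing at
the log format here; the regexes would need adjusting to whatever
ptestlong.log actually contains):

    import re

    # ASSUMED format: a "sage -t ... <file>" line per file, with its
    # time reported later on a line like "[12.3 s]".
    times = {}
    current = None
    with open('ptestlong.log') as log:
        for line in log:
            m = re.match(r'sage -t\s+(\S+)\s*$', line)
            if m:
                current = m.group(1)
            m = re.search(r'\[(\d+(?:\.\d+)?) s\]', line)
            if m and current is not None:
                times[current] = float(m.group(1))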


But really we need CPU time for doctests to make this useful.

IMHO, it would be good if the output from the doctests showed real time, CPU
time, and the actual time/date. Then we could correlate failures with system
logs, to see if we have run out of swap space or similar.
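
Something along these lines, say (the test runner itself is a stand-in
here):

    import datetime
    import time

    def timed_run(test):
        # Record a timestamp, wall-clock time and CPU time for one test.
        stamp = datetime.datetime.now().isoformat()
        t_wall, t_cpu = time.time(), time.process_time()
        test()    # stand-in for whatever actually executes the doctest
        wall = time.time() - t_wall
        cpu = time.process_time() - t_cpu
        print('%s  wall=%.2fs  cpu=%.2fs' % (stamp, wall, cpu))

    # e.g. timed_run(lambda: sum(i * i for i in range(10 ** 6)))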


dave



[sage-devel] Re: Regression testing

2010-10-25 Thread Donald Alan Morrison


On Oct 25, 8:19 am, David Kirkby  wrote:
> Getting a checksum of each doctest would be easy. I suggest we use:
>
> $ cksum sometest.py  | awk '{print $1}'
>
> because that will be totally portable across all platforms. 'cksum' is
> a 32-bit checksum that's part of the POSIX standard, and the algorithm
> is defined. So there's no worry about whether one has an md5 program,
> and if so, what it's called.

http://docs.python.org/library/hashlib.html#module-hashlib

Python's standard library "hashlib" contains both MD5 and SHA1 Message
Digests.

Their advantage over a checksum (CRC) algorithm is that the output
digest changes dramatically when only one input bit changes.
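
Easy to see with the standard library; '1' and '3' differ in exactly
one bit:

    import hashlib
    import zlib

    a = b'factor(2^127+1)'
    b = b'factor(2^127+3)'

    # CRC-32 is linear: the difference between the two checksums depends
    # only on which bit flipped, not on the rest of the input.
    print('%08x  %08x' % (zlib.crc32(a), zlib.crc32(b)))
    # SHA-1: about half of the 160 output bits change, unpredictably.
    print(hashlib.sha1(a).hexdigest())
    print(hashlib.sha1(b).hexdigest())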



[sage-devel] Re: Regression testing

2010-10-21 Thread koffie
This would indeed be a good addition to the Sage development process.

Before going through all the work of implementing this stuff, it might
be wise to first look at what's already out there. It might spare you
from duplicating work, or give inspiration on how to do it. There
seems to be a package which already does some performance testing, in
the form of pyUnitPerf: http://sourceforge.net/projects/pyunitperf/ .
There is also a coding recipe that tries to make timing results
comparable across machines by normalising test times against a pystone
measurement of the machine:
http://code.activestate.com/recipes/440700-performance-testing-with-a-pystone-measurement-dec/
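
The gist of that recipe is roughly this (test.pystone ships with
CPython 2.x; the idea is that seconds times pystones-per-second is
roughly machine-independent):

    import time
    from test import pystone   # bundled with CPython 2.x

    _benchtime, stones_per_sec = pystone.pystones()

    def kilopystone_seconds(seconds):
        # A slow machine has a low pystone rate, so the same workload
        # gives roughly the same normalised figure everywhere.
        return seconds * stones_per_sec / 1000.0

    t0 = time.time()
    sum(i * i for i in range(10 ** 6))    # stand-in workload
    print(kilopystone_seconds(time.time() - t0))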

Since we're probably not the only ones wanting this sort of testing
facility, it might be interesting to collaborate on it with the
programmers of other open-source Python projects and make a standalone
project of it. That way we wouldn't have to maintain it all on our
own, and it would be entirely in the open-source spirit.

- Maarten Derickx

On Oct 21, 6:55 am, Robert Bradshaw  wrote:
> On Wed, Oct 20, 2010 at 5:33 PM, David Roe  wrote:
> > There are a number of tickets in trac about performance regressions in
> > Sage.  I'm sure there are far more performance regressions which we don't
> > know about because nobody noticed.
>
> > As someone writing library code, it's generally not obvious that one is
> > about to introduce a performance regression (otherwise you'd probably not do
> > it).  There have been instances where I've wondered whether a change I was
> > making might have unintended consequences for performance, but testing all
> > the ways in which this could manifest is laborious.
>
> > Consequently, I've been thinking recently about how to detect performance
> > regressions automatically.  There are really two parts to the problem:
> > gathering timing data on the Sage library, and analyzing that data to
> > determine if regressions have occurred (and how serious they are).
>
> +1. This has been lacking for a long time.
>
> > Data gathering:
>
> > One could modify local/bin/sage-doctest to allow the option of changing each
> > doctest by wrapping it in a "timeit()" call.  This would then generate a
> > timing datum for each doctest line.
> > * these timings vary from run to run (presumably due to differing levels of
> > load on the machine).  I don't know how to account for this, but usually
> > it's a fairly small effect (on the order of 10% error).
> > * if you're testing against a previous version of sage, the doctest
> > structure will have changed because people wrote more doctests.  And doctest
> > lines depend on each other: you define variables that are used in later
> > lines.  So inserting a line could make timings of later lines incomparable
> > to the exact same line without the inserted line.  We might be able to parse
> > the lines and check that various objects are actually the same across
> > different versions of sage; this would require either a version-invariant
> > hash, or saving in one version, loading in the other, and comparing (and
> > you would have to do that for each variable that occurs in the line). But
> > that seems to be getting too complicated...
>
> > Many users will only have one version of sage installed.  And even with
> > multiple versions, you need somewhere to PUT the data in order to compare
> > against it later.
>
> > One way of handling these problems is to create a relational database to put
> > timings in.  This could also be interesting for benchmarketing purposes: we
> > could have timings on various machines, and we highlight performance
> > improvements, in addition to watching for performance regressions.
>
> > So, here's a first draft for a database schema to put timing data into.
> > I've included a description of each table, along with a description of
> > columns I thought were non-obvious.  I'm definitely interested in
> > suggestions for improving this schema.
>
> > Table: Hosts
> > # computer information; including identifying data to determine when running
> > on same host
> > col: id
> > col: identifying_data # some way to uniquely identify the computer on which
> > a test is being run. Presumably the output of some unix function, but I
> > don't know what.
>
> > Table: Sage_version
> > # a table giving each existing version of Sage an id
> > # the ids for official sage releases should be consistent across databases;
> > # users can also create their own temporary versions which use a different
> > block of id numbers.
> > # when a new version is created, the Files, Doctest_blocks and Doctest_lines
> > tables are updated
> > col: id
> > col: version_name # string defining the version
> > col: prev_version_id # versions should be totally ordered.  Since we want to
> > reserve some id space for official sage versions and other space for users,
> > this can't be done by just the numerical id.
>
> > Table: Files
> > # a table for all the files in sage that have doctests
>