On Mon, Oct 25, 2010 at 8:39 PM, David Roe <r...@math.harvard.edu> wrote:
> I think if you set both number and repeat to 1 in sage.misc.sage_timeit, it
> will only run once (though I could be wrong).

Yes, though it'd probably be both cheap and valuable to run fast
commands more than once (but fewer times than the default timeit
parameters, unless this is explicitly a timing run).
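Something along these lines, say (a rough sketch using the stdlib timeit module that sage_timeit wraps; the helper name and threshold are made up):

```python
import timeit

def quick_time(stmt, setup="pass", threshold=0.1):
    # Hypothetical helper: run the statement once, and only if it
    # turns out to be cheap, repeat it a few more times (far fewer
    # than timeit's defaults) and keep the best per-run measurement.
    timer = timeit.Timer(stmt, setup=setup)
    best = timer.timeit(number=1)
    if best < threshold:
        runs = timer.repeat(repeat=3, number=5)
        best = min(best, min(runs) / 5)
    return best
```

So a slow command only ever pays for one execution, while fast ones get a slightly more stable number almost for free.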

> We should think about a way to automate uploading of timing data if someone
> doesn't have MongoDB installed.  For example, we could have the test script
> which ran doctests have the option of sending an e-mail somewhere.  Or make
> pymongo standard in Sage.

With the way the lmfdb group is going, it may make sense to make
PyMongo a standard package (despite being so easy to install
manually). A simple http server accepting json data wouldn't be too
hard to throw up either, now that we've entered the realm of running a
service (mongod).
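The JSON-accepting server really is tiny; here's a sketch with the stdlib only (in-memory storage stands in for the mongod insert, and the handler class is invented for illustration):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class TimingHandler(BaseHTTPRequestHandler):
    """Accept POSTed JSON timing records. Here they just land in an
    in-memory list; a real server would insert them into mongod via
    pymongo instead."""
    records = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            doc = json.loads(self.rfile.read(length))
        except ValueError:
            self.send_response(400)  # reject malformed JSON
            self.end_headers()
            return
        self.records.append(doc)
        self.send_response(200)
        self.end_headers()

def serve(port=8000):
    HTTPServer(("", port), TimingHandler).serve_forever()
```

Clients would then need nothing beyond an HTTP library to report timings, sidestepping the pymongo dependency entirely.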

- Robert

> On Mon, Oct 25, 2010 at 23:26, Robert Bradshaw
> <rober...@math.washington.edu> wrote:
>>
>> On Mon, Oct 25, 2010 at 11:54 AM, William Stein <wst...@gmail.com> wrote:
>> >> Also, I was talking to Craig Citro about this and he had the
>> >> interesting idea of creating some kind of a "test object" which would
>> >> be saved and could then be loaded into future versions of Sage and
>> >> re-run. The idea of saving the tests that are run, and then running
>> >> the exact same tests (rather than worrying about the correlation of
>> >> files and tests) will make catching regressions much easier.
>> >
>> > Hi,
>> >
>> > Wow, that's an *extremely* good idea!  Nice work, Craig.
>> > Basically, we could have one object that has:
>> >
>> >    (a) list of tests that got run.
>> >    (b) for each of several machines and sage versions:
>> >            - how long each test took
>> >
>> > Regarding (a), this gets extracted from the doctests somehow for
>> > starters, though some other tests could be thrown in if we want.
>> >
>> > I could easily imagine storing the above as a single entry in a
>> > MongoDB collection (say):
>> >
>> >   {'tests': [ordered list of input blocks of code that could be
>> >              extracted from doctests],
>> >    'timings': [{'machine': 'sage.math.washington.edu',
>> >                 'version': 'sage-4.6.alpha3', 'timings': [a list of floats]},
>> >                {'machine': 'bsd.math.washington.edu',
>> >                 'version': 'sage-4.5.3', 'timings': [a list of floats]}]}
>> >
>> > Note that the ordered list of input blocks could be stored using
>> > GridFS, since it's bigger than MongoDB's 4MB document size limit:
>> >
>> > wst...@sage:~/build/sage-4.6.alpha3/devel/sage$ sage -grep "sage:" > a
>> > wst...@sage:~/build/sage-4.6.alpha3/devel/sage$ ls -lh a
>> > -rw-r--r-- 1 wstein wstein 9.7M 2010-10-25 11:41 a
>> > wst...@sage:~/build/sage-4.6.alpha3/devel/sage$ wc -l a
>> > 133579 a
>> >
>> > Alternatively, the list of input blocks could be stored in its own
>> > collection, which would just get named by the tests field:
>> >
>> >    {'tests':'williams_test_suite_2010-10-25'}
>> >
>> > The latter is nice, since it would make it much easier to make a web
>> > app that allows for browsing through the timing results, e.g., sort
>> > them from slowest to fastest, and easily click to get the input that
>> > took a long time.
>> >
>> > Another option:  have exactly one collection for each test suite, and
>> > have all other data be in that collection:
>> >
>> > Collection name: "williams_test_suite-2010-10-25"
>> >
>> > Documents:
>> >
>> >  * A document with a unique id, starting at 0, for each actual test
>> >       {'id':0, 'code':'factor(2^127+1)'}
>> >
>> >  * A document for each result of running the tests on an actual
>> > platform:
>> >       {'machine':'bsd.math.washington.edu', 'version':'sage-4.5.3',
>> > 'timings':{0:1.3, 1:0.5,...} }
>> > Here, the timings are stored as a mapping from id's to floats.
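Joining those timing documents back to their inputs is then a one-liner in any client; here's a toy in-memory model of that layout, with made-up test data and a hypothetical helper (a real version would pull the same documents out of the collection with pymongo):

```python
# One document per test, one per (machine, version) timing run,
# mirroring the "one collection per test suite" layout above.
tests = [
    {"id": 0, "code": "factor(2^127+1)"},
    {"id": 1, "code": "is_prime(2^31-1)"},
]
runs = [
    {"machine": "bsd.math.washington.edu", "version": "sage-4.5.3",
     "timings": {0: 1.3, 1: 0.5}},
]

def slowest(run, tests, n=1):
    """Pair timings with their input code, slowest first."""
    code = {t["id"]: t["code"] for t in tests}
    ranked = sorted(run["timings"].items(), key=lambda kv: -kv[1])
    return [(code[i], t) for i, t in ranked[:n]]
```

That's exactly the query the web app would run for its "slowest tests" view.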
>>
>> +1. My only hesitance with this is that it requires either an internet
>> connection or mongodb to participate, both of which are "optional"
>> features of Sage. Of course, people could store their timings locally
>> as lists of dicts, and then optionally upload to mongodb (requiring
>> only pymongo, a much lighter dependency, or even via a json front end).
>>
>> > I think timing should all be done using the "timeit" module, since
>> > that is the Python standard module designed for exactly the purpose of
>> > benchmarking. That ends the discussion about CPU time versus walltime,
>> > etc.
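For reference, the bare-module version of this (Sage's sage_timeit is a wrapper around the same machinery; statement and loop counts here are just for illustration):

```python
import timeit

# timeit disables the garbage collector during measurement, and
# taking the min over a few repeat() runs gives a stable wall-time
# figure per call, which is why it settles the clock debates.
per_call = min(timeit.repeat("sum(range(1000))",
                             repeat=3, number=1000)) / 1000
```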
>> >
>> > The architecture of separating out the tests from the
>> > timing/recording/web_view framework is very nice, because it makes it
>> > easy to add in additional test suites.  E.g., I've had companies ask
>> > me: "Could paid support include the statement: 'in all future releases
>> > of Sage, the following commands will run in at most the following
>> > times on the following hardware'?"  And they have specific commands
>> > they care about.
>>
>> OTOH, I think that generating timing data as part of a standard test
>> run could be very valuable. This precludes the use of timeit in
>> general for long-ish running tests. Not that the two ideas are
>> incompatible.
>>
>> - Robert
>>
>> --
>> To post to this group, send an email to sage-devel@googlegroups.com
>> To unsubscribe from this group, send an email to
>> sage-devel+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/sage-devel
>> URL: http://www.sagemath.org
>
